mirror of
https://github.com/zhayujie/chatgpt-on-wechat.git
synced 2026-06-03 02:27:09 +08:00
Compare commits
45 Commits
feat-cow-c
...
2.0.5
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
830b8f2971 | ||
|
|
b058af122c | ||
|
|
174ee0cafc | ||
|
|
1c336380c0 | ||
|
|
3068880413 | ||
|
|
be596681e5 | ||
|
|
66b71c50e9 | ||
|
|
8744810b25 | ||
|
|
7f94d37c2e | ||
|
|
6d9b7baeb4 | ||
|
|
4470d4c352 | ||
|
|
d2a462a279 | ||
|
|
14ff2a15e7 | ||
|
|
6d1369900e | ||
|
|
1f17ebe69e | ||
|
|
1ae2918064 | ||
|
|
b6571e5cad | ||
|
|
7549d48cf1 | ||
|
|
00353dd0cb | ||
|
|
afd947195d | ||
|
|
e57ef37167 | ||
|
|
ef33a93654 | ||
|
|
61732aecfc | ||
|
|
6764c05c3f | ||
|
|
fa149cf4aa | ||
|
|
e4f9697d06 | ||
|
|
da061450e5 | ||
|
|
d09ae49287 | ||
|
|
511ee0bbaf | ||
|
|
3cb5a0fbd6 | ||
|
|
e06925ab85 | ||
|
|
184634e4e7 | ||
|
|
843c2d02cc | ||
|
|
8ea2455766 | ||
|
|
9dc9987d56 | ||
|
|
3458621147 | ||
|
|
079df5a47c | ||
|
|
ddb07c65a1 | ||
|
|
9b21cd222b | ||
|
|
90f736843f | ||
|
|
13c020eb61 | ||
|
|
dbc06dbe95 | ||
|
|
23d097bc1c | ||
|
|
294e380288 | ||
|
|
4c1c42efac |
1
.gitignore
vendored
1
.gitignore
vendored
@@ -36,6 +36,7 @@ plugins/banwords/lib/__pycache__
|
||||
!plugins/cow_cli
|
||||
client_config.json
|
||||
ref/
|
||||
**/.dev.vars
|
||||
.cursor/
|
||||
local/
|
||||
node_modules/
|
||||
|
||||
67
README.md
67
README.md
@@ -13,6 +13,7 @@
|
||||
<a href="https://cowagent.ai/">🌐 官网</a> ·
|
||||
<a href="https://docs.cowagent.ai/">📖 文档中心</a> ·
|
||||
<a href="https://docs.cowagent.ai/guide/quick-start">🚀 快速开始</a> ·
|
||||
<a href="https://skills.cowagent.ai/">🧩 技能广场</a> ·
|
||||
<a href="https://link-ai.tech/cowagent/create">☁️ 在线体验</a>
|
||||
</p>
|
||||
|
||||
@@ -21,12 +22,14 @@
|
||||
|
||||
> 该项目既是一个可以开箱即用的超级 AI 助理,也是一个支持高扩展的 Agent 框架,可以通过为项目扩展大模型接口、接入渠道、内置工具、Skills 系统来灵活实现各种定制需求。核心能力如下:
|
||||
|
||||
- ✅ **复杂任务规划**:能够理解复杂任务并自主规划执行,持续思考和调用工具直到完成目标,支持通过工具操作访问文件、终端、浏览器、定时任务等系统资源
|
||||
- ✅ **长期记忆:** 自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索
|
||||
- ✅ **技能系统:** 实现了 Skills 创建和运行的引擎,内置多种技能,并支持通过自然语言对话完成自定义 Skills 开发
|
||||
- ✅ **自主任务规划**:能够理解复杂任务并自主规划执行,持续思考和调用工具直到完成目标
|
||||
- ✅ **长期记忆:** 自动将对话记忆持久化至本地文件和数据库中,包括核心记忆和日级记忆,支持关键词及向量检索
|
||||
- ✅ **技能系统:** Skills 安装和运行的引擎,支持从 [Skill Hub](https://skills.cowagent.ai/)、GitHub 等一键安装技能,或通过对话创造 Skills
|
||||
- ✅ **工具系统:** 内置文件读写、终端执行、浏览器操作、定时任务等工具,Agent 自主调用以完成复杂任务
|
||||
- ✅ **CLI系统:** 提供终端命令和对话命令,支持进程管理、技能安装、配置修改等操作
|
||||
- ✅ **多模态消息:** 支持对文本、图片、语音、文件等多类型消息进行解析、处理、生成、发送等操作
|
||||
- ✅ **多模型接入:** 支持 OpenAI, Claude, Gemini, DeepSeek, MiniMax、GLM、Qwen、Kimi、Doubao 等国内外主流模型厂商
|
||||
- ✅ **多端部署:** 支持运行在本地计算机或服务器,可集成到微信、飞书、钉钉、企业微信、QQ、微信公众号、网页中使用
|
||||
- ✅ **多模型支持:** 支持 OpenAI, Claude, Gemini, DeepSeek, MiniMax、GLM、Qwen、Kimi、Doubao 等国内外主流模型厂商
|
||||
- ✅ **多通道接入:** 支持运行在本地计算机或服务器,可集成到微信、飞书、钉钉、企业微信、QQ、微信公众号、网页中使用
|
||||
|
||||
## 声明
|
||||
|
||||
@@ -66,6 +69,8 @@
|
||||
|
||||
# 🏷 更新日志
|
||||
|
||||
>**2026.04.01:** [2.0.5版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.5),Cow CLI 命令系统、Skill Hub 开源、浏览器工具、企微扫码创建、多项优化和修复。
|
||||
|
||||
>**2026.03.22:** [2.0.4版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.4),新增个人微信通道(微信扫码即用)、新增 MiniMax-M2.7 和 GLM-5-Turbo 模型、run.sh 脚本重构、日文文档及多项修复。
|
||||
|
||||
>**2026.03.18:** [2.0.3版本](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.3),新增企微智能机器人和 QQ 通道、支持 Coding Plan、新增多个模型、Web 端文件处理、记忆系统升级。
|
||||
@@ -86,11 +91,17 @@
|
||||
|
||||
在终端执行以下命令:
|
||||
|
||||
**Linux / macOS:**
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
|
||||
脚本使用说明:[一键运行脚本](https://docs.cowagent.ai/guide/quick-start)
|
||||
**Windows(PowerShell):**
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
|
||||
脚本使用说明:[一键运行脚本](https://docs.cowagent.ai/guide/quick-start)。安装后可使用 `cow start`、`cow stop` 等 [CLI 命令](https://docs.cowagent.ai/commands/index) 管理服务。
|
||||
|
||||
|
||||
## 一、准备
|
||||
@@ -134,6 +145,24 @@ pip3 install -r requirements-optional.txt
|
||||
|
||||
如果某项依赖安装失败可注释掉对应的行后重试。
|
||||
|
||||
**(4) 安装 Cow CLI (推荐):**
|
||||
|
||||
```bash
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
安装后可使用 `cow` 命令管理服务(启动、停止、更新等)和技能,详见 [命令文档](https://docs.cowagent.ai/commands/index)。
|
||||
|
||||
**(5) 安装浏览器工具 (可选):**
|
||||
|
||||
如果需要 Agent 操作浏览器(如访问网页、填写表单等),需要额外安装浏览器依赖:
|
||||
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
该命令会自动安装 `playwright` 和 Chromium 浏览器,国内网络自动使用镜像加速。详见 [浏览器工具文档](https://docs.cowagent.ai/tools/browser)。
|
||||
|
||||
## 二、配置
|
||||
|
||||
配置文件的模板在根目录的 `config-template.json` 中,需复制该模板创建最终生效的 `config.json` 文件:
|
||||
@@ -210,7 +239,8 @@ pip3 install -r requirements-optional.txt
|
||||
如果是个人计算机 **本地运行**,直接在项目根目录下执行:
|
||||
|
||||
```bash
|
||||
python3 app.py # windows 环境下该命令通常为 python app.py
|
||||
cow start # 推荐,需先安装 Cow CLI
|
||||
python3 app.py # 或直接运行,windows 环境下该命令通常为 python app.py
|
||||
```
|
||||
|
||||
运行后默认会启动 web 服务,可通过访问 `http://localhost:9899/chat` 在网页端对话。
|
||||
@@ -220,15 +250,24 @@ python3 app.py # windows 环境下该命令通常为 python app.py
|
||||
|
||||
### 2.服务器部署
|
||||
|
||||
在服务器中可使用 `nohup` 命令在后台运行程序:
|
||||
推荐使用 `cow` 命令管理服务:
|
||||
|
||||
```bash
|
||||
cow start # 后台启动
|
||||
cow stop # 停止服务
|
||||
cow restart # 重启服务
|
||||
cow status # 查看运行状态
|
||||
cow logs # 查看日志
|
||||
cow update # 拉取最新代码并重启
|
||||
```
|
||||
|
||||
也可以使用传统方式后台运行:
|
||||
|
||||
```bash
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
```
|
||||
|
||||
执行后程序运行于服务器后台,可通过 `ctrl+c` 关闭日志,不会影响后台程序的运行。使用 `ps -ef | grep app.py | grep -v grep` 命令可查看运行于后台的进程,如果想要重新启动程序可以先 `kill` 掉对应的进程。 日志关闭后如果想要再次打开只需输入 `tail -f nohup.out`。
|
||||
|
||||
此外,项目根目录下的 `run.sh` 脚本支持一键启动和管理服务,包括 `./run.sh start`、`./run.sh stop`、`./run.sh restart`、`./run.sh logs` 等命令,执行 `./run.sh help` 可查看全部用法。
|
||||
此外,项目根目录下的 `run.sh` 脚本也支持一键管理服务,包括 `./run.sh start`、`./run.sh stop`、`./run.sh restart` 等命令,执行 `./run.sh help` 可查看全部用法。
|
||||
|
||||
> 如果需要通过浏览器访问 Web 控制台,请确保服务器的 `9899` 端口已在防火墙或安全组中放行,建议仅对指定 IP 开放以保证安全。
|
||||
|
||||
@@ -830,8 +869,10 @@ QQ 机器人使用 WebSocket 长连接模式,无需公网 IP 和域名,支
|
||||
|
||||
# 🔗 相关项目
|
||||
|
||||
- [Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub):开源的 AI Agent 技能广场,浏览、搜索、安装和发布技能,支持 CowAgent、OpenClaw、Claude Code 等多种 Agent。
|
||||
- [bot-on-anything](https://github.com/zhayujie/bot-on-anything):轻量和高可扩展的大模型应用框架,支持接入 Slack, Telegram, Discord, Gmail 等海外平台,可作为本项目的补充使用。
|
||||
- [AgentMesh](https://github.com/MinimalFuture/AgentMesh):开源的多智能体( Multi-Agent )框架,可以通过多智能体团队的协同来解决复杂问题。本项目基于该框架实现了[Agent 插件](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/plugins/agent/README.md),可访问终端、浏览器、文件系统、搜索引擎 等各类工具,并实现了多智能体协同。
|
||||
- [AgentMesh](https://github.com/MinimalFuture/AgentMesh):开源的多智能体( Multi-Agent )框架,可以通过多智能体团队的协同来解决复杂问题。
|
||||
|
||||
|
||||
|
||||
|
||||
@@ -843,7 +884,7 @@ FAQs: <https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs>
|
||||
|
||||
# 🛠️ 开发
|
||||
|
||||
欢迎接入更多应用通道,参考 [飞书通道](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py) 新增自定义通道,实现接收和发送消息逻辑即可完成接入。 同时欢迎贡献新的Skills,参考 [Skill创造器说明](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md)。
|
||||
欢迎接入更多应用通道,参考 [飞书通道](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py) 新增自定义通道,实现接收和发送消息逻辑即可完成接入。同时欢迎贡献新的 Skills,向 [Skill Hub](https://skills.cowagent.ai/submit) 提交技能。
|
||||
|
||||
# ✉ 联系
|
||||
|
||||
|
||||
@@ -75,6 +75,23 @@ class ChatService:
|
||||
# a new segment; collect tool results until turn_end.
|
||||
state.pending_tool_results = []
|
||||
|
||||
elif event_type == "file_to_send":
|
||||
url = data.get("url") or ""
|
||||
if url:
|
||||
fname = data.get("file_name") or "file"
|
||||
ft = data.get("file_type") or "file"
|
||||
if ft == "image":
|
||||
link = f""
|
||||
else:
|
||||
link = f"[{fname}]({url})"
|
||||
send_chunk_fn({
|
||||
"chunk_type": "content",
|
||||
"delta": "\n\n" + link + "\n\n",
|
||||
"segment_id": state.segment_id,
|
||||
})
|
||||
# Remove url so the model won't repeat it in its reply
|
||||
data.pop("url", None)
|
||||
|
||||
elif event_type == "tool_execution_start":
|
||||
# Notify the client that a tool is about to run (with its input args)
|
||||
tool_name = data.get("tool_name", "")
|
||||
|
||||
@@ -134,6 +134,8 @@ class MemoryService:
|
||||
else:
|
||||
return {"action": action, "code": 400, "message": f"unknown action: {action}", "payload": None}
|
||||
|
||||
except ValueError as e:
|
||||
return {"action": action, "code": 403, "message": "invalid filename", "payload": None}
|
||||
except FileNotFoundError as e:
|
||||
return {"action": action, "code": 404, "message": str(e), "payload": None}
|
||||
except Exception as e:
|
||||
@@ -145,14 +147,26 @@ class MemoryService:
|
||||
# ------------------------------------------------------------------
|
||||
def _resolve_path(self, filename: str) -> str:
|
||||
"""
|
||||
Resolve a filename to its absolute path.
|
||||
Safely resolve a filename to its absolute path within the allowed directory.
|
||||
|
||||
- ``MEMORY.md`` → ``{workspace_root}/MEMORY.md``
|
||||
- ``2026-02-20.md`` → ``{workspace_root}/memory/2026-02-20.md``
|
||||
|
||||
Raises ValueError if the resolved path escapes the allowed directory
|
||||
(path traversal protection).
|
||||
"""
|
||||
if filename == "MEMORY.md":
|
||||
return os.path.join(self.workspace_root, filename)
|
||||
return os.path.join(self.memory_dir, filename)
|
||||
base_dir = self.workspace_root
|
||||
else:
|
||||
base_dir = self.memory_dir
|
||||
|
||||
resolved = os.path.realpath(os.path.join(base_dir, filename))
|
||||
allowed = os.path.realpath(base_dir)
|
||||
|
||||
if resolved != allowed and not resolved.startswith(allowed + os.sep):
|
||||
raise ValueError(f"Invalid filename: path traversal detected")
|
||||
|
||||
return resolved
|
||||
|
||||
@staticmethod
|
||||
def _file_info(path: str, filename: str, file_type: str) -> dict:
|
||||
|
||||
@@ -16,16 +16,26 @@ from datetime import datetime
|
||||
from common.log import logger
|
||||
|
||||
|
||||
SUMMARIZE_SYSTEM_PROMPT = """你是一个记忆提取助手。你的任务是从对话记录中提取值得记住的信息,生成简洁的记忆摘要。
|
||||
SUMMARIZE_SYSTEM_PROMPT = """你是一个记忆提取助手。你的任务是从对话记录中提炼出值得长期记住的关键事件和核心信息。
|
||||
|
||||
核心原则:
|
||||
- 按「事件」维度归纳,而不是按对话轮次逐条记录
|
||||
- 多轮对话如果围绕同一件事,合并为一条摘要
|
||||
- 只记录有长期价值的信息,忽略闲聊、问候、无意义的短消息
|
||||
|
||||
输出要求:
|
||||
1. 以事件/关键信息为维度记录,每条一行,用 "- " 开头
|
||||
2. 记录有价值的关键信息,例如用户提出的要求及助手的解决方案,对话中涉及的事实信息,用户的偏好、决策或重要结论
|
||||
3. 每条摘要需要简明扼要,只保留关键信息
|
||||
4. 直接输出摘要内容,不要加任何前缀说明
|
||||
5. 当对话没有任何记录价值例如只是简单问候,可回复"无\""""
|
||||
1. 每条一行,用 "- " 开头,格式为:事件/主题 + 关键结论或结果
|
||||
2. 值得记录的信息类型:用户提出的需求及最终解决方案、重要的事实信息、用户的偏好或决策、关键技术方案或配置变更
|
||||
3. 不值得记录的信息:简单问候、闲聊、无实质内容的短消息、重复的中间过程
|
||||
4. 每条摘要应当简明扼要,一句话概括事件的核心内容和结果
|
||||
5. 直接输出摘要内容,不要加任何前缀说明
|
||||
6. 当对话没有任何记录价值(仅含问候或无意义内容),回复"无"
|
||||
|
||||
SUMMARIZE_USER_PROMPT = """请从以下对话记录中提取关键信息,生成记忆摘要:
|
||||
示例(仅供参考格式):
|
||||
- 用户配置了 XX 功能,设置参数为 YY,已生效
|
||||
- 用户反馈了 XX 问题,原因是 YY,通过 ZZ 方式解决"""
|
||||
|
||||
SUMMARIZE_USER_PROMPT = """请从以下对话记录中,按关键事件维度提炼记忆摘要(合并同一事件的多轮对话,不要逐条列出):
|
||||
|
||||
{conversation}"""
|
||||
|
||||
@@ -220,14 +230,16 @@ class MemoryFlushManager:
|
||||
if not conversation_text.strip():
|
||||
return ""
|
||||
|
||||
# Try LLM summarization first
|
||||
if self.llm_model:
|
||||
try:
|
||||
summary = self._call_llm_for_summary(conversation_text)
|
||||
if summary and summary.strip() and summary.strip() != "无":
|
||||
return summary.strip()
|
||||
logger.info(f"[MemoryFlush] LLM returned empty or '无', using fallback")
|
||||
except Exception as e:
|
||||
logger.warning(f"[MemoryFlush] LLM summarization failed, using fallback: {e}")
|
||||
else:
|
||||
logger.info("[MemoryFlush] No LLM model available, using rule-based fallback")
|
||||
|
||||
return self._extract_summary_fallback(messages, max_messages)
|
||||
|
||||
@@ -277,27 +289,38 @@ class MemoryFlushManager:
|
||||
|
||||
@staticmethod
|
||||
def _extract_summary_fallback(messages: List[Dict], max_messages: int = 0) -> str:
|
||||
"""Rule-based fallback when LLM is unavailable."""
|
||||
"""
|
||||
Rule-based fallback when LLM is unavailable.
|
||||
Groups consecutive user+assistant messages into events instead of
|
||||
listing each message individually.
|
||||
"""
|
||||
msgs = messages if max_messages == 0 else messages[-max_messages * 2:]
|
||||
|
||||
items = []
|
||||
|
||||
events: List[str] = []
|
||||
current_user_text = ""
|
||||
for msg in msgs:
|
||||
role = msg.get("role", "")
|
||||
text = MemoryFlushManager._extract_text_from_content(msg.get("content", ""))
|
||||
if not text or not text.strip():
|
||||
continue
|
||||
text = text.strip()
|
||||
|
||||
|
||||
if role == "user":
|
||||
if len(text) <= 5:
|
||||
continue
|
||||
items.append(f"- 用户请求: {text[:200]}")
|
||||
elif role == "assistant":
|
||||
current_user_text = text[:150]
|
||||
elif role == "assistant" and current_user_text:
|
||||
first_line = text.split("\n")[0].strip()
|
||||
if len(first_line) > 10:
|
||||
items.append(f"- 处理结果: {first_line[:200]}")
|
||||
|
||||
return "\n".join(items[:15])
|
||||
events.append(f"- {current_user_text} → {first_line[:150]}")
|
||||
else:
|
||||
events.append(f"- {current_user_text}")
|
||||
current_user_text = ""
|
||||
|
||||
if current_user_text:
|
||||
events.append(f"- {current_user_text}")
|
||||
|
||||
return "\n".join(events[:10])
|
||||
|
||||
@staticmethod
|
||||
def _extract_text_from_content(content) -> str:
|
||||
|
||||
@@ -165,12 +165,13 @@ def _build_tooling_section(tools: List[Any], language: str) -> List[str]:
|
||||
"terminal": "管理后台进程",
|
||||
"web_search": "网络搜索",
|
||||
"web_fetch": "获取URL内容",
|
||||
"browser": "控制浏览器",
|
||||
"browser": "控制浏览器(关键结果或需要协助可截图发送给用户)",
|
||||
"memory_search": "搜索记忆",
|
||||
"memory_get": "读取记忆内容",
|
||||
"env_config": "管理API密钥和技能配置",
|
||||
"scheduler": "管理定时任务和提醒",
|
||||
"send": "发送本地文件给用户(仅限本地文件,URL直接放在回复文本中)",
|
||||
"vision": "分析图片内容(识别、描述、OCR文字提取等)",
|
||||
}
|
||||
|
||||
# Preferred display order
|
||||
@@ -179,7 +180,7 @@ def _build_tooling_section(tools: List[Any], language: str) -> List[str]:
|
||||
"bash", "terminal",
|
||||
"web_search", "web_fetch", "browser",
|
||||
"memory_search", "memory_get",
|
||||
"env_config", "scheduler", "send",
|
||||
"env_config", "scheduler", "send", "vision",
|
||||
]
|
||||
|
||||
# Build name -> summary mapping for available tools
|
||||
@@ -383,7 +384,7 @@ def _build_workspace_section(workspace_dir: str, language: str) -> List[str]:
|
||||
"**💬 交流规范**:",
|
||||
"",
|
||||
"- 对话中不要暴露内部技术细节(文件名、工具名等),用自然语言表达。例如说「我已记住」而非「已更新 MEMORY.md」",
|
||||
"- 做真正有帮助的助手,而不是表演式的客套。跳过「好的!」「当然可以!」之类的套话,直接帮忙解决问题",
|
||||
"- 做真正有帮助的助手,而不是表演式的客套,尽可能帮忙解决问题",
|
||||
"- 回复应结构清晰、重点突出。善用 **加粗**、列表、分段等格式让信息一目了然",
|
||||
"- 适当使用 emoji 让表达更生动自然 🎯,但不要过度堆砌",
|
||||
"",
|
||||
|
||||
@@ -300,13 +300,13 @@ class AgentStreamExecutor:
|
||||
f"with same arguments. This may indicate a loop."
|
||||
)
|
||||
|
||||
# Check if this is a file to send (from read tool)
|
||||
# Check if this is a file to send
|
||||
if result.get("status") == "success" and isinstance(result.get("result"), dict):
|
||||
result_data = result.get("result")
|
||||
if result_data.get("type") == "file_to_send":
|
||||
# Store file metadata for later sending
|
||||
self.files_to_send.append(result_data)
|
||||
logger.info(f"📎 检测到待发送文件: {result_data.get('file_name', result_data.get('path'))}")
|
||||
self._emit_event("file_to_send", result_data)
|
||||
|
||||
# Check for critical error - abort entire conversation
|
||||
if result.get("status") == "critical_error":
|
||||
|
||||
@@ -102,13 +102,17 @@ class SkillManager:
|
||||
else:
|
||||
enabled = entry.metadata.default_enabled if entry.metadata else True
|
||||
|
||||
merged[name] = {
|
||||
entry_dict = {
|
||||
"name": name,
|
||||
"description": skill.description,
|
||||
"source": prev.get("source") or skill.source,
|
||||
"enabled": enabled,
|
||||
"category": category,
|
||||
}
|
||||
display_name = prev.get("display_name")
|
||||
if display_name:
|
||||
entry_dict["display_name"] = display_name
|
||||
merged[name] = entry_dict
|
||||
|
||||
self.skills_config = merged
|
||||
self._save_skills_config()
|
||||
|
||||
@@ -87,25 +87,25 @@ FileSave = _optional_tools.get('FileSave')
|
||||
Terminal = _optional_tools.get('Terminal')
|
||||
|
||||
|
||||
# Delayed import for BrowserTool
|
||||
# BrowserTool (requires playwright)
|
||||
def _import_browser_tool():
|
||||
from common.log import logger
|
||||
try:
|
||||
from agent.tools.browser.browser_tool import BrowserTool
|
||||
return BrowserTool
|
||||
except ImportError:
|
||||
# Return a placeholder class that will prompt the user to install dependencies when instantiated
|
||||
class BrowserToolPlaceholder:
|
||||
def __init__(self, *args, **kwargs):
|
||||
raise ImportError(
|
||||
"The 'browser-use' package is required to use BrowserTool. "
|
||||
"Please install it with 'pip install browser-use>=0.1.40'."
|
||||
)
|
||||
except ImportError as e:
|
||||
logger.info(
|
||||
f"[Tools] BrowserTool not loaded - missing dependency: {e}\n"
|
||||
f" To enable browser tool, run:\n"
|
||||
f" pip install playwright\n"
|
||||
f" playwright install chromium"
|
||||
)
|
||||
return None
|
||||
except Exception as e:
|
||||
logger.error(f"[Tools] BrowserTool failed to load: {e}")
|
||||
return None
|
||||
|
||||
return BrowserToolPlaceholder
|
||||
|
||||
|
||||
# Dynamically set BrowserTool
|
||||
# BrowserTool = _import_browser_tool()
|
||||
BrowserTool = _import_browser_tool()
|
||||
|
||||
# Export all tools (including optional ones that might be None)
|
||||
__all__ = [
|
||||
@@ -124,8 +124,7 @@ __all__ = [
|
||||
'WebSearch',
|
||||
'WebFetch',
|
||||
'Vision',
|
||||
# Optional tools (may be None if dependencies not available)
|
||||
# 'BrowserTool'
|
||||
'BrowserTool',
|
||||
]
|
||||
|
||||
"""
|
||||
|
||||
3
agent/tools/browser/__init__.py
Normal file
3
agent/tools/browser/__init__.py
Normal file
@@ -0,0 +1,3 @@
|
||||
from agent.tools.browser.browser_tool import BrowserTool
|
||||
|
||||
__all__ = ["BrowserTool"]
|
||||
708
agent/tools/browser/browser_service.py
Normal file
708
agent/tools/browser/browser_service.py
Normal file
@@ -0,0 +1,708 @@
|
||||
"""
|
||||
Browser service - Playwright wrapper managing browser lifecycle and page operations.
|
||||
|
||||
All Playwright calls run on a dedicated background thread so that callers from
|
||||
any worker thread can safely use the service. An idle-timeout mechanism
|
||||
automatically shuts down the browser (and its thread) after a configurable
|
||||
period of inactivity to free resources.
|
||||
"""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import uuid
|
||||
import queue
|
||||
import threading
|
||||
from typing import Optional, Dict, Any, List, Callable
|
||||
|
||||
from common.log import logger
|
||||
|
||||
try:
|
||||
from playwright.sync_api import sync_playwright, Browser, BrowserContext, Page, Playwright
|
||||
_HAS_PLAYWRIGHT = True
|
||||
except ImportError:
|
||||
_HAS_PLAYWRIGHT = False
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Snapshot DOM helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Tags that typically carry useful content for an agent
|
||||
_INTERACTIVE_TAGS = {
|
||||
"a", "button", "input", "textarea", "select", "option",
|
||||
"label", "details", "summary",
|
||||
}
|
||||
_SEMANTIC_TAGS = {
|
||||
"h1", "h2", "h3", "h4", "h5", "h6",
|
||||
"p", "li", "td", "th", "caption", "figcaption", "blockquote", "pre", "code",
|
||||
"nav", "main", "article", "section", "header", "footer", "form", "table",
|
||||
"img", "video", "audio",
|
||||
}
|
||||
_KEEP_TAGS = _INTERACTIVE_TAGS | _SEMANTIC_TAGS
|
||||
|
||||
_SNAPSHOT_JS = """
|
||||
() => {
|
||||
const KEEP = new Set(%s);
|
||||
const INTERACTIVE = new Set(%s);
|
||||
const SKIP = new Set(["script","style","noscript","svg","path","meta","link","br","hr"]);
|
||||
let refCounter = 0;
|
||||
const refMap = {};
|
||||
|
||||
function visible(el) {
|
||||
if (!(el instanceof HTMLElement)) return true;
|
||||
const st = window.getComputedStyle(el);
|
||||
if (st.display === "none" || st.visibility === "hidden") return false;
|
||||
if (parseFloat(st.opacity) === 0) return false;
|
||||
return true;
|
||||
}
|
||||
|
||||
function walk(node) {
|
||||
if (node.nodeType === Node.TEXT_NODE) {
|
||||
const t = node.textContent.trim();
|
||||
return t ? t : null;
|
||||
}
|
||||
if (node.nodeType !== Node.ELEMENT_NODE) return null;
|
||||
const tag = node.tagName.toLowerCase();
|
||||
if (SKIP.has(tag)) return null;
|
||||
if (!visible(node)) return null;
|
||||
|
||||
const children = [];
|
||||
for (const ch of node.childNodes) {
|
||||
const r = walk(ch);
|
||||
if (r !== null) {
|
||||
if (typeof r === "string") children.push(r);
|
||||
else children.push(r);
|
||||
}
|
||||
}
|
||||
|
||||
const keep = KEEP.has(tag);
|
||||
if (!keep) {
|
||||
// Unwrap: promote children
|
||||
if (children.length === 0) return null;
|
||||
if (children.length === 1) return children[0];
|
||||
return children;
|
||||
}
|
||||
|
||||
const obj = { tag };
|
||||
if (INTERACTIVE.has(tag)) {
|
||||
refCounter++;
|
||||
obj.ref = refCounter;
|
||||
refMap[refCounter] = node;
|
||||
}
|
||||
|
||||
// Attributes
|
||||
if (tag === "a" && node.href) obj.href = node.getAttribute("href");
|
||||
if (tag === "img") {
|
||||
obj.alt = node.alt || "";
|
||||
obj.src = node.getAttribute("src") || "";
|
||||
}
|
||||
if (tag === "input" || tag === "textarea" || tag === "select") {
|
||||
obj.type = node.type || "text";
|
||||
obj.name = node.name || undefined;
|
||||
obj.value = node.value || undefined;
|
||||
obj.placeholder = node.placeholder || undefined;
|
||||
if (node.disabled) obj.disabled = true;
|
||||
if (tag === "input" && node.type === "checkbox") obj.checked = node.checked;
|
||||
}
|
||||
if (tag === "button") {
|
||||
if (node.disabled) obj.disabled = true;
|
||||
}
|
||||
if (tag === "option") {
|
||||
obj.value = node.value;
|
||||
if (node.selected) obj.selected = true;
|
||||
}
|
||||
if (tag === "label" && node.htmlFor) obj.for = node.htmlFor;
|
||||
|
||||
// Role / aria-label
|
||||
const role = node.getAttribute("role");
|
||||
if (role) obj.role = role;
|
||||
const ariaLabel = node.getAttribute("aria-label");
|
||||
if (ariaLabel) obj.ariaLabel = ariaLabel;
|
||||
|
||||
// Children
|
||||
if (children.length === 1 && typeof children[0] === "string") {
|
||||
obj.text = children[0];
|
||||
} else if (children.length > 0) {
|
||||
obj.children = children;
|
||||
}
|
||||
|
||||
return obj;
|
||||
}
|
||||
|
||||
// Store refMap on window for later use by click/fill actions
|
||||
const result = walk(document.body);
|
||||
window.__cowRefMap = refMap;
|
||||
return { tree: result, refCount: refCounter };
|
||||
}
|
||||
""" % (
|
||||
str(list(_KEEP_TAGS)),
|
||||
str(list(_INTERACTIVE_TAGS)),
|
||||
)
|
||||
|
||||
|
||||
def _should_use_headless() -> bool:
|
||||
"""Decide headless mode: headless on Linux servers without display, headed elsewhere."""
|
||||
if sys.platform in ("win32", "darwin"):
|
||||
return False
|
||||
# Linux: check for display
|
||||
if os.environ.get("DISPLAY") or os.environ.get("WAYLAND_DISPLAY"):
|
||||
return False
|
||||
return True
|
||||
|
||||
|
||||
def _flatten_tree(node, indent=0) -> List[str]:
|
||||
"""Convert snapshot tree to compact text lines for LLM consumption."""
|
||||
if node is None:
|
||||
return []
|
||||
if isinstance(node, str):
|
||||
return [" " * indent + node]
|
||||
if isinstance(node, list):
|
||||
lines = []
|
||||
for child in node:
|
||||
lines.extend(_flatten_tree(child, indent))
|
||||
return lines
|
||||
if not isinstance(node, dict):
|
||||
return []
|
||||
|
||||
tag = node.get("tag", "?")
|
||||
ref = node.get("ref")
|
||||
parts = [tag]
|
||||
if ref:
|
||||
parts[0] = f"[{ref}] {tag}"
|
||||
|
||||
# Inline attributes
|
||||
for attr in ("type", "name", "href", "alt", "role", "ariaLabel", "placeholder", "value"):
|
||||
val = node.get(attr)
|
||||
if val:
|
||||
# Truncate long values
|
||||
s = str(val)
|
||||
if len(s) > 80:
|
||||
s = s[:77] + "..."
|
||||
parts.append(f'{attr}="{s}"')
|
||||
|
||||
for flag in ("disabled", "checked", "selected"):
|
||||
if node.get(flag):
|
||||
parts.append(flag)
|
||||
|
||||
prefix = " " * indent
|
||||
header = prefix + " ".join(parts)
|
||||
|
||||
text = node.get("text")
|
||||
if text:
|
||||
# Truncate long text
|
||||
if len(text) > 120:
|
||||
text = text[:117] + "..."
|
||||
header += f": {text}"
|
||||
|
||||
lines = [header]
|
||||
children = node.get("children", [])
|
||||
for child in children:
|
||||
lines.extend(_flatten_tree(child, indent + 2))
|
||||
return lines
|
||||
|
||||
|
||||
class BrowserService:
|
||||
"""Manages a Playwright browser on a dedicated background thread.
|
||||
|
||||
All Playwright operations are dispatched to a single long-lived thread via
|
||||
a task queue. Callers from *any* worker thread can use the public API
|
||||
safely. An idle timer automatically shuts the browser down after
|
||||
``idle_timeout`` seconds of inactivity (default 300 = 5 min).
|
||||
"""
|
||||
|
||||
_IDLE_TIMEOUT_DEFAULT = 300 # seconds
|
||||
|
||||
def __init__(self, config: Optional[Dict[str, Any]] = None):
|
||||
self._config = config or {}
|
||||
self._headless: Optional[bool] = None
|
||||
self._screenshot_dir: Optional[str] = None
|
||||
|
||||
# Background thread state
|
||||
self._thread: Optional[threading.Thread] = None
|
||||
self._task_queue: queue.Queue = queue.Queue()
|
||||
self._lock = threading.Lock()
|
||||
self._alive = False
|
||||
self._ready = threading.Event()
|
||||
|
||||
# Playwright objects (only accessed on the background thread)
|
||||
self._playwright = None
|
||||
self._browser = None
|
||||
self._context = None
|
||||
self._page = None
|
||||
|
||||
# Idle auto-release
|
||||
idle_cfg = self._config.get("idle_timeout")
|
||||
self._idle_timeout: float = float(idle_cfg) if idle_cfg is not None else self._IDLE_TIMEOUT_DEFAULT
|
||||
self._idle_timer: Optional[threading.Timer] = None
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Background-thread lifecycle
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _start_thread(self):
|
||||
"""Start the dedicated Playwright thread if not already running."""
|
||||
with self._lock:
|
||||
if self._alive and self._thread and self._thread.is_alive():
|
||||
return
|
||||
# Wait for old thread to fully exit before creating a new one
|
||||
old = self._thread
|
||||
if old and old.is_alive():
|
||||
old.join(timeout=5)
|
||||
# Fresh queue to avoid stale sentinels from a previous close()
|
||||
self._task_queue = queue.Queue()
|
||||
self._alive = True
|
||||
self._ready = threading.Event()
|
||||
self._thread = threading.Thread(target=self._run_loop, daemon=True, name="BrowserThread")
|
||||
self._thread.start()
|
||||
# Block until browser is ready (or failed)
|
||||
self._ready.wait(timeout=30)
|
||||
|
||||
def _run_loop(self):
|
||||
"""Event loop running on the dedicated thread. Processes tasks until stopped."""
|
||||
logger.info("[Browser] Background thread started")
|
||||
try:
|
||||
self._launch_browser()
|
||||
except Exception as e:
|
||||
logger.error(f"[Browser] Failed to launch browser: {e}")
|
||||
self._alive = False
|
||||
self._ready.set()
|
||||
self._drain_queue(RuntimeError(f"Browser launch failed: {e}"))
|
||||
return
|
||||
self._ready.set()
|
||||
|
||||
while self._alive:
|
||||
try:
|
||||
task = self._task_queue.get(timeout=1.0)
|
||||
except queue.Empty:
|
||||
continue
|
||||
if task is None:
|
||||
break
|
||||
fn, args, kwargs, result_slot = task
|
||||
try:
|
||||
result_slot["value"] = fn(*args, **kwargs)
|
||||
except Exception as e:
|
||||
result_slot["error"] = e
|
||||
finally:
|
||||
result_slot["event"].set()
|
||||
|
||||
self._shutdown_browser()
|
||||
self._drain_queue(RuntimeError("Browser thread stopped"))
|
||||
logger.info("[Browser] Background thread exited")
|
||||
|
||||
def _drain_queue(self, error: Exception):
|
||||
"""Unblock all callers waiting on the queue with an error."""
|
||||
while True:
|
||||
try:
|
||||
task = self._task_queue.get_nowait()
|
||||
except queue.Empty:
|
||||
break
|
||||
if task is None:
|
||||
continue
|
||||
_, _, _, result_slot = task
|
||||
result_slot["error"] = error
|
||||
result_slot["event"].set()
|
||||
|
||||
def _launch_browser(self):
|
||||
"""Launch Chromium on the background thread."""
|
||||
if self._headless is None:
|
||||
headless_cfg = self._config.get("headless")
|
||||
self._headless = headless_cfg if headless_cfg is not None else _should_use_headless()
|
||||
|
||||
launch_args = ["--disable-dev-shm-usage"]
|
||||
if self._headless:
|
||||
launch_args.append("--no-sandbox")
|
||||
|
||||
extra_args = self._config.get("launch_args", [])
|
||||
if extra_args:
|
||||
launch_args.extend(extra_args)
|
||||
|
||||
viewport_w = self._config.get("viewport_width", 1280)
|
||||
viewport_h = self._config.get("viewport_height", 720)
|
||||
|
||||
self._playwright = sync_playwright().start()
|
||||
logger.info(f"[Browser] Launching Chromium (headless={self._headless})")
|
||||
self._browser = self._playwright.chromium.launch(
|
||||
headless=self._headless,
|
||||
args=launch_args,
|
||||
)
|
||||
self._context = self._browser.new_context(
|
||||
viewport={"width": viewport_w, "height": viewport_h},
|
||||
user_agent=(
|
||||
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
|
||||
"AppleWebKit/537.36 (KHTML, like Gecko) "
|
||||
"Chrome/131.0.0.0 Safari/537.36"
|
||||
),
|
||||
)
|
||||
self._page = self._context.new_page()
|
||||
logger.info("[Browser] Browser ready")
|
||||
|
||||
def _shutdown_browser(self):
|
||||
"""Shut down all Playwright resources on the background thread."""
|
||||
self._cancel_idle_timer()
|
||||
for obj, label in [
|
||||
(self._context, "context"),
|
||||
(self._browser, "browser"),
|
||||
]:
|
||||
try:
|
||||
if obj:
|
||||
obj.close()
|
||||
except Exception as e:
|
||||
logger.debug(f"[Browser] {label} close error: {e}")
|
||||
try:
|
||||
if self._playwright:
|
||||
self._playwright.stop()
|
||||
except Exception as e:
|
||||
logger.debug(f"[Browser] playwright stop error: {e}")
|
||||
self._page = None
|
||||
self._context = None
|
||||
self._browser = None
|
||||
self._playwright = None
|
||||
logger.info("[Browser] Browser closed")
|
||||
|
||||
def _submit(self, fn: Callable, *args, **kwargs):
|
||||
"""Submit *fn* to the background thread and block until it completes."""
|
||||
self._start_thread()
|
||||
|
||||
if not self._alive:
|
||||
raise RuntimeError("Browser is not available")
|
||||
|
||||
self._reset_idle_timer()
|
||||
|
||||
result_slot: Dict[str, Any] = {"event": threading.Event()}
|
||||
self._task_queue.put((fn, args, kwargs, result_slot))
|
||||
|
||||
# Timeout prevents permanent hang if the background thread crashes
|
||||
completed = result_slot["event"].wait(timeout=120)
|
||||
if not completed:
|
||||
raise TimeoutError("Browser operation timed out (120s)")
|
||||
|
||||
if "error" in result_slot:
|
||||
raise result_slot["error"]
|
||||
return result_slot.get("value")
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Idle auto-release
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _reset_idle_timer(self):
|
||||
self._cancel_idle_timer()
|
||||
if self._idle_timeout > 0:
|
||||
self._idle_timer = threading.Timer(self._idle_timeout, self._on_idle_timeout)
|
||||
self._idle_timer.daemon = True
|
||||
self._idle_timer.start()
|
||||
|
||||
def _cancel_idle_timer(self):
|
||||
if self._idle_timer:
|
||||
self._idle_timer.cancel()
|
||||
self._idle_timer = None
|
||||
|
||||
def _on_idle_timeout(self):
|
||||
logger.info(f"[Browser] Idle for {self._idle_timeout}s, auto-releasing browser")
|
||||
self.close()
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Public lifecycle
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def close(self):
|
||||
"""Shut down browser and background thread (safe from any thread)."""
|
||||
self._cancel_idle_timer()
|
||||
with self._lock:
|
||||
if not self._alive:
|
||||
return
|
||||
self._alive = False
|
||||
t = self._thread
|
||||
if self._task_queue is not None:
|
||||
self._task_queue.put(None)
|
||||
if t is not None and t.is_alive():
|
||||
t.join(timeout=10)
|
||||
with self._lock:
|
||||
self._thread = None
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Actions (each method is dispatched to the background thread)
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def navigate(self, url: str, timeout: int = 30000) -> Dict[str, Any]:
|
||||
return self._submit(self._do_navigate, url, timeout)
|
||||
|
||||
def _do_navigate(self, url: str, timeout: int) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
resp = page.goto(url, wait_until="domcontentloaded", timeout=timeout)
|
||||
status = resp.status if resp else None
|
||||
except Exception as e:
|
||||
return {"error": f"Navigation failed: {e}"}
|
||||
|
||||
try:
|
||||
page.wait_for_load_state("networkidle", timeout=8000)
|
||||
except Exception:
|
||||
pass
|
||||
page.wait_for_timeout(500)
|
||||
|
||||
try:
|
||||
title = page.title()
|
||||
except Exception:
|
||||
title = ""
|
||||
try:
|
||||
current_url = page.url
|
||||
except Exception:
|
||||
current_url = url
|
||||
|
||||
return {"url": current_url, "title": title, "status": status}
|
||||
|
||||
def snapshot(self, selector: Optional[str] = None) -> str:
|
||||
return self._submit(self._do_snapshot, selector)
|
||||
|
||||
def _do_snapshot(self, selector: Optional[str] = None) -> str:
|
||||
page = self._page
|
||||
try:
|
||||
result = page.evaluate(_SNAPSHOT_JS)
|
||||
except Exception as e:
|
||||
return f"[Snapshot error: {e}]"
|
||||
|
||||
tree = result.get("tree")
|
||||
ref_count = result.get("refCount", 0)
|
||||
lines = _flatten_tree(tree)
|
||||
|
||||
try:
|
||||
title = page.title()
|
||||
except Exception:
|
||||
title = ""
|
||||
try:
|
||||
url = page.url
|
||||
except Exception:
|
||||
url = ""
|
||||
|
||||
header = f"Page: {title} ({url})\nInteractive elements: {ref_count}\n---"
|
||||
body = "\n".join(lines)
|
||||
|
||||
max_chars = self._config.get("snapshot_max_chars", 30000)
|
||||
if len(body) > max_chars:
|
||||
body = body[:max_chars] + "\n... [snapshot truncated]"
|
||||
|
||||
return f"{header}\n{body}"
|
||||
|
||||
def screenshot(self, full_page: bool = False, cwd: str = "") -> str:
|
||||
return self._submit(self._do_screenshot, full_page, cwd)
|
||||
|
||||
def _do_screenshot(self, full_page: bool = False, cwd: str = "") -> str:
|
||||
page = self._page
|
||||
save_dir = self._get_screenshot_dir(cwd)
|
||||
filename = f"screenshot_{uuid.uuid4().hex[:8]}.png"
|
||||
filepath = os.path.join(save_dir, filename)
|
||||
page.screenshot(path=filepath, full_page=full_page)
|
||||
logger.info(f"[Browser] Screenshot saved: {filepath}")
|
||||
return filepath
|
||||
|
||||
def click(self, ref: Optional[int] = None, selector: Optional[str] = None,
|
||||
timeout: int = 5000) -> Dict[str, Any]:
|
||||
return self._submit(self._do_click, ref, selector, timeout)
|
||||
|
||||
def _do_click(self, ref, selector, timeout) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
if ref is not None:
|
||||
result = page.evaluate(f"""
|
||||
() => {{
|
||||
const el = window.__cowRefMap && window.__cowRefMap[{ref}];
|
||||
if (!el) return {{ error: "ref {ref} not found. Run snapshot first." }};
|
||||
el.click();
|
||||
return {{ clicked: true, tag: el.tagName.toLowerCase() }};
|
||||
}}
|
||||
""")
|
||||
if result.get("error"):
|
||||
return result
|
||||
page.wait_for_timeout(500)
|
||||
return result
|
||||
elif selector:
|
||||
page.click(selector, timeout=timeout)
|
||||
return {"clicked": True, "selector": selector}
|
||||
else:
|
||||
return {"error": "Provide either ref (from snapshot) or selector"}
|
||||
except Exception as e:
|
||||
return {"error": f"Click failed: {e}"}
|
||||
|
||||
def fill(self, text: str, ref: Optional[int] = None,
|
||||
selector: Optional[str] = None, timeout: int = 5000) -> Dict[str, Any]:
|
||||
return self._submit(self._do_fill, text, ref, selector, timeout)
|
||||
|
||||
def _do_fill(self, text, ref, selector, timeout) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
if ref is not None:
|
||||
result = page.evaluate(f"""
|
||||
() => {{
|
||||
const el = window.__cowRefMap && window.__cowRefMap[{ref}];
|
||||
if (!el) return {{ error: "ref {ref} not found. Run snapshot first." }};
|
||||
el.focus();
|
||||
el.value = "";
|
||||
return {{ tag: el.tagName.toLowerCase(), name: el.name || "" }};
|
||||
}}
|
||||
""")
|
||||
if result.get("error"):
|
||||
return result
|
||||
page.keyboard.type(text)
|
||||
return {"filled": True, "ref": ref, "text": text}
|
||||
elif selector:
|
||||
page.fill(selector, text, timeout=timeout)
|
||||
return {"filled": True, "selector": selector, "text": text}
|
||||
else:
|
||||
return {"error": "Provide either ref (from snapshot) or selector"}
|
||||
except Exception as e:
|
||||
return {"error": f"Fill failed: {e}"}
|
||||
|
||||
def select(self, value: str, ref: Optional[int] = None,
|
||||
selector: Optional[str] = None, timeout: int = 5000) -> Dict[str, Any]:
|
||||
return self._submit(self._do_select, value, ref, selector, timeout)
|
||||
|
||||
def _do_select(self, value, ref, selector, timeout) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
if ref is not None:
|
||||
result = page.evaluate(f"""
|
||||
() => {{
|
||||
const el = window.__cowRefMap && window.__cowRefMap[{ref}];
|
||||
if (!el || el.tagName.toLowerCase() !== "select")
|
||||
return {{ error: "ref {ref} is not a <select> element" }};
|
||||
el.value = {repr(value)};
|
||||
el.dispatchEvent(new Event("change", {{ bubbles: true }}));
|
||||
return {{ selected: true, value: el.value }};
|
||||
}}
|
||||
""")
|
||||
return result
|
||||
elif selector:
|
||||
page.select_option(selector, value, timeout=timeout)
|
||||
return {"selected": True, "selector": selector, "value": value}
|
||||
else:
|
||||
return {"error": "Provide either ref (from snapshot) or selector"}
|
||||
except Exception as e:
|
||||
return {"error": f"Select failed: {e}"}
|
||||
|
||||
def scroll(self, direction: str = "down", amount: int = 500) -> Dict[str, Any]:
|
||||
return self._submit(self._do_scroll, direction, amount)
|
||||
|
||||
def _do_scroll(self, direction, amount) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
delta_map = {
|
||||
"down": (0, amount),
|
||||
"up": (0, -amount),
|
||||
"right": (amount, 0),
|
||||
"left": (-amount, 0),
|
||||
}
|
||||
dx, dy = delta_map.get(direction, (0, amount))
|
||||
try:
|
||||
page.mouse.wheel(dx, dy)
|
||||
page.wait_for_timeout(300)
|
||||
scroll_info = page.evaluate("""
|
||||
() => ({
|
||||
scrollX: window.scrollX,
|
||||
scrollY: window.scrollY,
|
||||
scrollHeight: document.documentElement.scrollHeight,
|
||||
clientHeight: document.documentElement.clientHeight
|
||||
})
|
||||
""")
|
||||
return {"scrolled": direction, "amount": amount, **scroll_info}
|
||||
except Exception as e:
|
||||
return {"error": f"Scroll failed: {e}"}
|
||||
|
||||
def wait(self, selector: Optional[str] = None, timeout: int = 5000,
|
||||
state: str = "visible") -> Dict[str, Any]:
|
||||
return self._submit(self._do_wait, selector, timeout, state)
|
||||
|
||||
def _do_wait(self, selector, timeout, state) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
if selector:
|
||||
page.wait_for_selector(selector, timeout=timeout, state=state)
|
||||
return {"waited": True, "selector": selector, "state": state}
|
||||
else:
|
||||
page.wait_for_timeout(timeout)
|
||||
return {"waited": True, "timeout_ms": timeout}
|
||||
except Exception as e:
|
||||
return {"error": f"Wait failed: {e}"}
|
||||
|
||||
def go_back(self) -> Dict[str, Any]:
|
||||
return self._submit(self._do_go_back)
|
||||
|
||||
def _do_go_back(self) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
page.go_back(wait_until="domcontentloaded", timeout=10000)
|
||||
try:
|
||||
title = page.title()
|
||||
except Exception:
|
||||
title = ""
|
||||
try:
|
||||
url = page.url
|
||||
except Exception:
|
||||
url = ""
|
||||
return {"url": url, "title": title}
|
||||
except Exception as e:
|
||||
return {"error": f"Go back failed: {e}"}
|
||||
|
||||
def go_forward(self) -> Dict[str, Any]:
|
||||
return self._submit(self._do_go_forward)
|
||||
|
||||
def _do_go_forward(self) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
page.go_forward(wait_until="domcontentloaded", timeout=10000)
|
||||
try:
|
||||
title = page.title()
|
||||
except Exception:
|
||||
title = ""
|
||||
try:
|
||||
url = page.url
|
||||
except Exception:
|
||||
url = ""
|
||||
return {"url": url, "title": title}
|
||||
except Exception as e:
|
||||
return {"error": f"Go forward failed: {e}"}
|
||||
|
||||
def get_text(self, selector: str) -> Dict[str, Any]:
|
||||
return self._submit(self._do_get_text, selector)
|
||||
|
||||
def _do_get_text(self, selector) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
text = page.text_content(selector, timeout=5000)
|
||||
return {"text": text or ""}
|
||||
except Exception as e:
|
||||
return {"error": f"Get text failed: {e}"}
|
||||
|
||||
def evaluate(self, script: str) -> Dict[str, Any]:
|
||||
return self._submit(self._do_evaluate, script)
|
||||
|
||||
def _do_evaluate(self, script) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
result = page.evaluate(script)
|
||||
return {"result": result}
|
||||
except Exception as e:
|
||||
return {"error": f"Evaluate failed: {e}"}
|
||||
|
||||
def press(self, key: str) -> Dict[str, Any]:
|
||||
return self._submit(self._do_press, key)
|
||||
|
||||
def _do_press(self, key) -> Dict[str, Any]:
|
||||
page = self._page
|
||||
try:
|
||||
page.keyboard.press(key)
|
||||
page.wait_for_timeout(300)
|
||||
return {"pressed": key}
|
||||
except Exception as e:
|
||||
return {"error": f"Press failed: {e}"}
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Helpers
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _get_screenshot_dir(self, cwd: str = "") -> str:
|
||||
if self._screenshot_dir and os.path.isdir(self._screenshot_dir):
|
||||
return self._screenshot_dir
|
||||
base = cwd or os.getcwd()
|
||||
d = os.path.join(base, "tmp")
|
||||
os.makedirs(d, exist_ok=True)
|
||||
self._screenshot_dir = d
|
||||
return d
|
||||
290
agent/tools/browser/browser_tool.py
Normal file
290
agent/tools/browser/browser_tool.py
Normal file
@@ -0,0 +1,290 @@
|
||||
"""
|
||||
Browser tool - Control a Chromium browser for web navigation and interaction.
|
||||
|
||||
Uses Playwright under the hood. Browser instance is lazily started on first
|
||||
use, reused across tool calls within the same session, and cleaned up via
|
||||
close().
|
||||
"""
|
||||
|
||||
import json
|
||||
import os
|
||||
from typing import Dict, Any, Optional
|
||||
|
||||
from agent.tools.base_tool import BaseTool, ToolResult
|
||||
from agent.tools.browser.browser_service import BrowserService
|
||||
from common.log import logger
|
||||
|
||||
|
||||
class BrowserTool(BaseTool):
|
||||
"""Single tool exposing all browser actions via an 'action' parameter."""
|
||||
|
||||
name: str = "browser"
|
||||
description: str = (
|
||||
"Control a browser to navigate web pages, interact with elements, and extract content. "
|
||||
"Actions: navigate, snapshot, click, fill, select, scroll, screenshot, wait, back, forward, "
|
||||
"get_text, press, evaluate.\n\n"
|
||||
"Workflow: navigate (auto-includes snapshot with element refs) → click/fill/select by ref → snapshot to verify.\n\n"
|
||||
"Use snapshot as the primary way to read pages. Use screenshot + send to show key results to the user. "
|
||||
"For login/CAPTCHA/authorization etc., screenshot and ask the user for help."
|
||||
)
|
||||
|
||||
params: dict = {
|
||||
"type": "object",
|
||||
"properties": {
|
||||
"action": {
|
||||
"type": "string",
|
||||
"description": (
|
||||
"The browser action to perform. One of: "
|
||||
"navigate, snapshot, click, fill, select, scroll, "
|
||||
"screenshot, wait, back, forward, get_text, press, evaluate"
|
||||
),
|
||||
"enum": [
|
||||
"navigate", "snapshot", "click", "fill", "select", "scroll",
|
||||
"screenshot", "wait", "back", "forward", "get_text", "press",
|
||||
"evaluate"
|
||||
]
|
||||
},
|
||||
"url": {
|
||||
"type": "string",
|
||||
"description": "URL to navigate to (for 'navigate' action)"
|
||||
},
|
||||
"ref": {
|
||||
"type": "integer",
|
||||
"description": "Element ref number from snapshot (for click/fill/select)"
|
||||
},
|
||||
"selector": {
|
||||
"type": "string",
|
||||
"description": "CSS selector as fallback when ref is unavailable (for click/fill/select/wait/get_text)"
|
||||
},
|
||||
"text": {
|
||||
"type": "string",
|
||||
"description": "Text to type (for 'fill' action)"
|
||||
},
|
||||
"value": {
|
||||
"type": "string",
|
||||
"description": "Option value (for 'select' action)"
|
||||
},
|
||||
"key": {
|
||||
"type": "string",
|
||||
"description": "Key to press, e.g. Enter, Tab, Escape (for 'press' action)"
|
||||
},
|
||||
"direction": {
|
||||
"type": "string",
|
||||
"description": "Scroll direction: up, down, left, right (for 'scroll' action, default: down)"
|
||||
},
|
||||
"script": {
|
||||
"type": "string",
|
||||
"description": "JavaScript code to execute (for 'evaluate' action)"
|
||||
},
|
||||
"full_page": {
|
||||
"type": "boolean",
|
||||
"description": "Capture full page screenshot (for 'screenshot' action, default: false)"
|
||||
},
|
||||
"timeout": {
|
||||
"type": "integer",
|
||||
"description": "Timeout in milliseconds (optional, default varies by action)"
|
||||
}
|
||||
},
|
||||
"required": ["action"]
|
||||
}
|
||||
|
||||
_shared_service: Optional[BrowserService] = None
|
||||
|
||||
def __init__(self, config: dict = None):
|
||||
self.config = config or {}
|
||||
self.cwd = self.config.get("cwd", os.getcwd())
|
||||
self._service: Optional[BrowserService] = None
|
||||
|
||||
def _get_service(self) -> BrowserService:
|
||||
"""Get or create the browser service, sharing across copies."""
|
||||
if self._service is not None:
|
||||
return self._service
|
||||
|
||||
# Reuse shared service across tool copies within the same session
|
||||
if BrowserTool._shared_service is not None:
|
||||
self._service = BrowserTool._shared_service
|
||||
return self._service
|
||||
|
||||
self._service = BrowserService(self.config)
|
||||
BrowserTool._shared_service = self._service
|
||||
return self._service
|
||||
|
||||
def execute(self, args: Dict[str, Any]) -> ToolResult:
|
||||
action = args.get("action", "").strip().lower()
|
||||
if not action:
|
||||
return ToolResult.fail("Error: 'action' parameter is required")
|
||||
|
||||
handler = self._ACTION_MAP.get(action)
|
||||
if not handler:
|
||||
valid = ", ".join(sorted(self._ACTION_MAP.keys()))
|
||||
return ToolResult.fail(f"Unknown action '{action}'. Valid actions: {valid}")
|
||||
|
||||
try:
|
||||
return handler(self, args)
|
||||
except Exception as e:
|
||||
logger.error(f"[Browser] Action '{action}' error: {e}")
|
||||
return ToolResult.fail(f"Browser error ({action}): {e}")
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Action handlers
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def _do_navigate(self, args: Dict[str, Any]) -> ToolResult:
|
||||
url = args.get("url", "").strip()
|
||||
if not url:
|
||||
return ToolResult.fail("Error: 'url' is required for navigate action")
|
||||
if not url.startswith(("http://", "https://")):
|
||||
url = "https://" + url
|
||||
timeout = args.get("timeout", 30000)
|
||||
service = self._get_service()
|
||||
result = service.navigate(url, timeout=timeout)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
# Auto-snapshot after navigation so the agent gets page content in one call
|
||||
snapshot_text = service.snapshot()
|
||||
return ToolResult.success(
|
||||
f"Navigated to: {result['url']}\nTitle: {result['title']}\nStatus: {result['status']}\n\n"
|
||||
f"--- Page Snapshot ---\n{snapshot_text}"
|
||||
)
|
||||
|
||||
def _do_snapshot(self, args: Dict[str, Any]) -> ToolResult:
|
||||
selector = args.get("selector")
|
||||
text = self._get_service().snapshot(selector=selector)
|
||||
return ToolResult.success(text)
|
||||
|
||||
def _do_click(self, args: Dict[str, Any]) -> ToolResult:
|
||||
ref = args.get("ref")
|
||||
selector = args.get("selector")
|
||||
timeout = args.get("timeout", 5000)
|
||||
result = self._get_service().click(ref=ref, selector=selector, timeout=timeout)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(f"Clicked successfully. Use 'snapshot' to see updated page.")
|
||||
|
||||
def _do_fill(self, args: Dict[str, Any]) -> ToolResult:
|
||||
text = args.get("text", "")
|
||||
ref = args.get("ref")
|
||||
selector = args.get("selector")
|
||||
timeout = args.get("timeout", 5000)
|
||||
if not text and text != "":
|
||||
return ToolResult.fail("Error: 'text' is required for fill action")
|
||||
result = self._get_service().fill(text, ref=ref, selector=selector, timeout=timeout)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(f"Filled text into element. Use 'snapshot' to verify.")
|
||||
|
||||
def _do_select(self, args: Dict[str, Any]) -> ToolResult:
|
||||
value = args.get("value", "")
|
||||
ref = args.get("ref")
|
||||
selector = args.get("selector")
|
||||
timeout = args.get("timeout", 5000)
|
||||
if not value:
|
||||
return ToolResult.fail("Error: 'value' is required for select action")
|
||||
result = self._get_service().select(value, ref=ref, selector=selector, timeout=timeout)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(f"Selected option '{value}'.")
|
||||
|
||||
def _do_scroll(self, args: Dict[str, Any]) -> ToolResult:
|
||||
direction = args.get("direction", "down")
|
||||
amount = args.get("timeout", 500) # reuse timeout field or default
|
||||
if "amount" in args:
|
||||
amount = args["amount"]
|
||||
result = self._get_service().scroll(direction=direction, amount=amount)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
pos = f"scrollY={result.get('scrollY', '?')}/{result.get('scrollHeight', '?')}"
|
||||
return ToolResult.success(f"Scrolled {direction}. Position: {pos}")
|
||||
|
||||
def _do_screenshot(self, args: Dict[str, Any]) -> ToolResult:
|
||||
full_page = args.get("full_page", False)
|
||||
filepath = self._get_service().screenshot(full_page=full_page, cwd=self.cwd)
|
||||
return ToolResult.success(f"Screenshot saved to: {filepath}")
|
||||
|
||||
def _do_wait(self, args: Dict[str, Any]) -> ToolResult:
|
||||
selector = args.get("selector")
|
||||
timeout = args.get("timeout", 5000)
|
||||
result = self._get_service().wait(selector=selector, timeout=timeout)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(f"Wait completed.")
|
||||
|
||||
def _do_back(self, args: Dict[str, Any]) -> ToolResult:
|
||||
result = self._get_service().go_back()
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(f"Navigated back to: {result['url']}")
|
||||
|
||||
def _do_forward(self, args: Dict[str, Any]) -> ToolResult:
|
||||
result = self._get_service().go_forward()
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(f"Navigated forward to: {result['url']}")
|
||||
|
||||
def _do_get_text(self, args: Dict[str, Any]) -> ToolResult:
|
||||
selector = args.get("selector", "").strip()
|
||||
if not selector:
|
||||
return ToolResult.fail("Error: 'selector' is required for get_text action")
|
||||
result = self._get_service().get_text(selector)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(result["text"])
|
||||
|
||||
def _do_press(self, args: Dict[str, Any]) -> ToolResult:
|
||||
key = args.get("key", "").strip()
|
||||
if not key:
|
||||
return ToolResult.fail("Error: 'key' is required for press action")
|
||||
result = self._get_service().press(key)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
return ToolResult.success(f"Pressed key: {key}")
|
||||
|
||||
def _do_evaluate(self, args: Dict[str, Any]) -> ToolResult:
|
||||
script = args.get("script", "").strip()
|
||||
if not script:
|
||||
return ToolResult.fail("Error: 'script' is required for evaluate action")
|
||||
result = self._get_service().evaluate(script)
|
||||
if "error" in result:
|
||||
return ToolResult.fail(result["error"])
|
||||
val = result.get("result")
|
||||
if isinstance(val, (dict, list)):
|
||||
return ToolResult.success(json.dumps(val, ensure_ascii=False, indent=2))
|
||||
return ToolResult.success(str(val) if val is not None else "(no return value)")
|
||||
|
||||
# Action dispatch table
|
||||
_ACTION_MAP = {
|
||||
"navigate": _do_navigate,
|
||||
"snapshot": _do_snapshot,
|
||||
"click": _do_click,
|
||||
"fill": _do_fill,
|
||||
"select": _do_select,
|
||||
"scroll": _do_scroll,
|
||||
"screenshot": _do_screenshot,
|
||||
"wait": _do_wait,
|
||||
"back": _do_back,
|
||||
"forward": _do_forward,
|
||||
"get_text": _do_get_text,
|
||||
"press": _do_press,
|
||||
"evaluate": _do_evaluate,
|
||||
}
|
||||
|
||||
# ------------------------------------------------------------------
|
||||
# Lifecycle
|
||||
# ------------------------------------------------------------------
|
||||
|
||||
def copy(self):
|
||||
"""Share browser instance across tool copies (avoids re-launching)."""
|
||||
new_tool = BrowserTool(self.config)
|
||||
new_tool.model = self.model
|
||||
new_tool.context = getattr(self, "context", None)
|
||||
new_tool.cwd = self.cwd
|
||||
new_tool._service = self._service
|
||||
return new_tool
|
||||
|
||||
def close(self):
|
||||
"""Release browser resources."""
|
||||
if self._service:
|
||||
self._service.close()
|
||||
self._service = None
|
||||
BrowserTool._shared_service = None
|
||||
logger.info("[Browser] BrowserTool closed")
|
||||
@@ -1,18 +0,0 @@
|
||||
def copy(self):
|
||||
"""
|
||||
Special copy method for browser tool to avoid recreating browser instance.
|
||||
|
||||
:return: A new instance with shared browser reference but unique model
|
||||
"""
|
||||
new_tool = self.__class__()
|
||||
|
||||
# Copy essential attributes
|
||||
new_tool.model = self.model
|
||||
new_tool.context = getattr(self, 'context', None)
|
||||
new_tool.config = getattr(self, 'config', None)
|
||||
|
||||
# Share the browser instance instead of creating a new one
|
||||
if hasattr(self, 'browser'):
|
||||
new_tool.browser = self.browser
|
||||
|
||||
return new_tool
|
||||
@@ -98,7 +98,18 @@ class Send(BaseTool):
|
||||
"size_formatted": self._format_size(file_size),
|
||||
"message": message or f"正在发送 {file_name}"
|
||||
}
|
||||
|
||||
|
||||
try:
|
||||
from common.cloud_client import get_website_base_url, copy_send_file
|
||||
|
||||
# Do nothing when in local env
|
||||
if get_website_base_url():
|
||||
url = copy_send_file(absolute_path, self.cwd)
|
||||
if url:
|
||||
result["url"] = url
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
return ToolResult.success(result)
|
||||
|
||||
def _resolve_path(self, path: str) -> str:
|
||||
|
||||
@@ -84,11 +84,11 @@ class ToolManager:
|
||||
except ImportError as e:
|
||||
# Handle missing dependencies with helpful messages
|
||||
error_msg = str(e)
|
||||
if "browser-use" in error_msg or "browser_use" in error_msg:
|
||||
if "playwright" in error_msg:
|
||||
logger.warning(
|
||||
f"[ToolManager] Browser tool not loaded - missing dependencies.\n"
|
||||
f" To enable browser tool, run:\n"
|
||||
f" pip install browser-use markdownify playwright\n"
|
||||
f" pip install playwright\n"
|
||||
f" playwright install chromium"
|
||||
)
|
||||
elif "markdownify" in error_msg:
|
||||
@@ -154,11 +154,11 @@ class ToolManager:
|
||||
except ImportError as e:
|
||||
# Handle missing dependencies with helpful messages
|
||||
error_msg = str(e)
|
||||
if "browser-use" in error_msg or "browser_use" in error_msg:
|
||||
if "playwright" in error_msg:
|
||||
logger.warning(
|
||||
f"[ToolManager] Browser tool not loaded - missing dependencies.\n"
|
||||
f" To enable browser tool, run:\n"
|
||||
f" pip install browser-use markdownify playwright\n"
|
||||
f" pip install playwright\n"
|
||||
f" playwright install chromium"
|
||||
)
|
||||
elif "markdownify" in error_msg:
|
||||
@@ -197,7 +197,7 @@ class ToolManager:
|
||||
logger.warning(
|
||||
f"[ToolManager] Browser tool is configured but not loaded.\n"
|
||||
f" To enable browser tool, run:\n"
|
||||
f" pip install browser-use markdownify playwright\n"
|
||||
f" pip install playwright\n"
|
||||
f" playwright install chromium"
|
||||
)
|
||||
elif tool_name == "google_search":
|
||||
|
||||
@@ -115,6 +115,8 @@ class AgentLLMModel(LLMModel):
|
||||
return const.QWEN_DASHSCOPE
|
||||
if model_name in [const.MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k"]:
|
||||
return const.MOONSHOT
|
||||
if conf().get("bot_type") == "modelscope":
|
||||
return const.MODELSCOPE
|
||||
for prefix, btype in self._MODEL_PREFIX_MAP:
|
||||
if model_name.startswith(prefix):
|
||||
return btype
|
||||
@@ -271,10 +273,13 @@ class AgentBridge:
|
||||
tool_manager.load_tools()
|
||||
|
||||
tools = []
|
||||
workspace_dir = kwargs.get("workspace_dir")
|
||||
for tool_name in tool_manager.tool_classes.keys():
|
||||
try:
|
||||
tool = tool_manager.create_tool(tool_name)
|
||||
if tool:
|
||||
if workspace_dir and hasattr(tool, 'cwd'):
|
||||
tool.cwd = workspace_dir
|
||||
tools.append(tool)
|
||||
except Exception as e:
|
||||
logger.warning(f"[AgentBridge] Failed to load tool {tool_name}: {e}")
|
||||
|
||||
@@ -366,7 +366,7 @@ class AgentInitializer:
|
||||
|
||||
if tool:
|
||||
# Apply workspace config to file operation tools
|
||||
if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls', 'web_fetch']:
|
||||
if tool_name in ['read', 'write', 'edit', 'bash', 'grep', 'find', 'ls', 'web_fetch', 'send', 'browser']:
|
||||
tool.config = file_config
|
||||
tool.cwd = file_config.get("cwd", getattr(tool, 'cwd', None))
|
||||
if 'memory_manager' in file_config:
|
||||
|
||||
@@ -455,6 +455,11 @@
|
||||
<h2 class="text-xl font-bold text-slate-800 dark:text-slate-100" data-i18n="skills_title">Skills</h2>
|
||||
<p class="text-sm text-slate-500 dark:text-slate-400 mt-1" data-i18n="skills_desc">View, enable, or disable agent skills</p>
|
||||
</div>
|
||||
<a href="https://skills.cowagent.ai/" target="_blank"
|
||||
class="inline-flex items-center gap-1.5 px-3 py-1.5 rounded-lg text-xs font-medium text-primary-500 bg-primary-50 dark:bg-primary-900/20 hover:bg-primary-100 dark:hover:bg-primary-900/30 transition-colors">
|
||||
<i class="fas fa-puzzle-piece text-[10px]"></i>
|
||||
<span data-i18n="skills_hub_btn">Skill Hub</span>
|
||||
</a>
|
||||
</div>
|
||||
|
||||
<!-- Built-in Tools Section -->
|
||||
|
||||
@@ -33,7 +33,7 @@ const I18N = {
|
||||
config_save: '保存', config_saved: '已保存',
|
||||
config_save_error: '保存失败',
|
||||
config_custom_option: '自定义...',
|
||||
skills_title: '技能管理', skills_desc: '查看、启用或禁用 Agent 技能',
|
||||
skills_title: '技能管理', skills_desc: '查看、启用或禁用 Agent 技能', skills_hub_btn: '探索技能广场',
|
||||
skills_loading: '加载技能中...', skills_loading_desc: '技能加载后将显示在此处',
|
||||
tools_section_title: '内置工具', tools_loading: '加载工具中...',
|
||||
skills_section_title: '技能', skill_enable: '启用', skill_disable: '禁用',
|
||||
@@ -56,6 +56,10 @@ const I18N = {
|
||||
weixin_scan_scanned: '已扫码,请在手机上确认', weixin_scan_expired: '二维码已过期,正在刷新...',
|
||||
weixin_scan_success: '登录成功,正在启动通道...', weixin_scan_fail: '获取二维码失败',
|
||||
weixin_qr_tip: '二维码约2分钟后过期',
|
||||
wecom_scan_btn: '扫码创建企微机器人', wecom_scan_desc: '使用企业微信扫码,一键创建智能机器人',
|
||||
wecom_scan_success: '创建成功,正在启动通道...',
|
||||
wecom_scan_fail: '创建失败',
|
||||
wecom_mode_scan: '扫码接入', wecom_mode_manual: '手动填写',
|
||||
tasks_title: '定时任务', tasks_desc: '查看和管理定时任务',
|
||||
tasks_coming: '即将推出', tasks_coming_desc: '定时任务管理功能即将在此提供',
|
||||
logs_title: '日志', logs_desc: '实时日志输出 (run.log)',
|
||||
@@ -84,7 +88,7 @@ const I18N = {
|
||||
config_save: 'Save', config_saved: 'Saved',
|
||||
config_save_error: 'Save failed',
|
||||
config_custom_option: 'Custom...',
|
||||
skills_title: 'Skills', skills_desc: 'View, enable, or disable agent skills',
|
||||
skills_title: 'Skills', skills_desc: 'View, enable, or disable agent skills', skills_hub_btn: 'Skill Hub',
|
||||
skills_loading: 'Loading skills...', skills_loading_desc: 'Skills will be displayed here after loading',
|
||||
tools_section_title: 'Built-in Tools', tools_loading: 'Loading tools...',
|
||||
skills_section_title: 'Skills', skill_enable: 'Enable', skill_disable: 'Disable',
|
||||
@@ -107,6 +111,10 @@ const I18N = {
|
||||
weixin_scan_scanned: 'Scanned, please confirm on your phone', weixin_scan_expired: 'QR code expired, refreshing...',
|
||||
weixin_scan_success: 'Login successful, starting channel...', weixin_scan_fail: 'Failed to load QR code',
|
||||
weixin_qr_tip: 'QR code expires in ~2 minutes',
|
||||
wecom_scan_btn: 'Scan to Create WeCom Bot', wecom_scan_desc: 'Scan with WeCom to create a bot instantly',
|
||||
wecom_scan_success: 'Bot created, starting channel...',
|
||||
wecom_scan_fail: 'Bot creation failed',
|
||||
wecom_mode_scan: 'Scan QR', wecom_mode_manual: 'Manual',
|
||||
tasks_title: 'Scheduled Tasks', tasks_desc: 'View and manage scheduled tasks',
|
||||
tasks_coming: 'Coming Soon', tasks_coming_desc: 'Scheduled task management will be available here',
|
||||
logs_title: 'Logs', logs_desc: 'Real-time log output (run.log)',
|
||||
@@ -464,6 +472,8 @@ let slashActiveIdx = 0;
|
||||
let slashFiltered = [];
|
||||
let slashJustSelected = false;
|
||||
let slashLastFilter = '';
|
||||
let slashLastMouseX = -1;
|
||||
let slashLastMouseY = -1;
|
||||
|
||||
function showSlashMenu(filter) {
|
||||
const q = filter.toLowerCase();
|
||||
@@ -482,6 +492,7 @@ function showSlashMenu(filter) {
|
||||
if (changed) slashActiveIdx = 0;
|
||||
slashActiveIdx = Math.min(slashActiveIdx, slashFiltered.length - 1);
|
||||
|
||||
slashNavByKeyboard = true;
|
||||
renderSlashItems();
|
||||
slashMenu.classList.remove('hidden');
|
||||
}
|
||||
@@ -492,6 +503,9 @@ function hideSlashMenu() {
|
||||
slashFiltered = [];
|
||||
slashActiveIdx = -1;
|
||||
slashLastFilter = '';
|
||||
slashNavByKeyboard = false;
|
||||
slashLastMouseX = -1;
|
||||
slashLastMouseY = -1;
|
||||
}
|
||||
|
||||
function isSlashMenuVisible() {
|
||||
@@ -507,21 +521,47 @@ function renderSlashItems() {
|
||||
`<span class="desc">${escapeHtml(c.desc)}</span></div>`
|
||||
).join('');
|
||||
|
||||
slashMenu.querySelectorAll('.slash-menu-item').forEach(el => {
|
||||
el.addEventListener('mouseenter', () => {
|
||||
slashActiveIdx = parseInt(el.dataset.idx);
|
||||
renderSlashItems();
|
||||
});
|
||||
el.addEventListener('mousedown', (e) => {
|
||||
e.preventDefault();
|
||||
selectSlashCommand(parseInt(el.dataset.idx));
|
||||
});
|
||||
});
|
||||
|
||||
const activeEl = slashMenu.querySelector('.slash-menu-item.active');
|
||||
if (activeEl) activeEl.scrollIntoView({ block: 'nearest' });
|
||||
}
|
||||
|
||||
// Delegated events on the persistent slashMenu container (not destroyed by innerHTML)
|
||||
// Use coordinate comparison to distinguish real mouse movement from DOM-rebuild phantom events.
|
||||
slashMenu.addEventListener('mousemove', (e) => {
|
||||
if (e.clientX === slashLastMouseX && e.clientY === slashLastMouseY) return;
|
||||
slashLastMouseX = e.clientX;
|
||||
slashLastMouseY = e.clientY;
|
||||
if (!slashNavByKeyboard) return;
|
||||
slashNavByKeyboard = false;
|
||||
const item = e.target.closest('.slash-menu-item');
|
||||
if (!item) return;
|
||||
const idx = parseInt(item.dataset.idx);
|
||||
if (idx === slashActiveIdx) return;
|
||||
slashActiveIdx = idx;
|
||||
slashMenu.querySelectorAll('.slash-menu-item').forEach(el => {
|
||||
el.classList.toggle('active', parseInt(el.dataset.idx) === idx);
|
||||
});
|
||||
});
|
||||
|
||||
slashMenu.addEventListener('mouseover', (e) => {
|
||||
if (slashNavByKeyboard) return;
|
||||
const item = e.target.closest('.slash-menu-item');
|
||||
if (!item) return;
|
||||
const idx = parseInt(item.dataset.idx);
|
||||
if (idx === slashActiveIdx) return;
|
||||
slashActiveIdx = idx;
|
||||
slashMenu.querySelectorAll('.slash-menu-item').forEach(el => {
|
||||
el.classList.toggle('active', parseInt(el.dataset.idx) === idx);
|
||||
});
|
||||
});
|
||||
|
||||
slashMenu.addEventListener('mousedown', (e) => {
|
||||
const item = e.target.closest('.slash-menu-item');
|
||||
if (!item) return;
|
||||
e.preventDefault();
|
||||
selectSlashCommand(parseInt(item.dataset.idx));
|
||||
});
|
||||
|
||||
function selectSlashCommand(idx) {
|
||||
if (idx < 0 || idx >= slashFiltered.length) return;
|
||||
const chosen = slashFiltered[idx].cmd;
|
||||
@@ -557,12 +597,14 @@ chatInput.addEventListener('keydown', function(e) {
|
||||
if (isSlashMenuVisible()) {
|
||||
if (e.key === 'ArrowDown') {
|
||||
e.preventDefault();
|
||||
slashNavByKeyboard = true;
|
||||
slashActiveIdx = Math.min(slashActiveIdx + 1, slashFiltered.length - 1);
|
||||
renderSlashItems();
|
||||
return;
|
||||
}
|
||||
if (e.key === 'ArrowUp') {
|
||||
e.preventDefault();
|
||||
slashNavByKeyboard = true;
|
||||
slashActiveIdx = Math.max(slashActiveIdx - 1, 0);
|
||||
renderSlashItems();
|
||||
return;
|
||||
@@ -719,6 +761,7 @@ function startSSE(requestId, loadingEl, timestamp) {
|
||||
let botEl = null;
|
||||
let stepsEl = null; // .agent-steps (thinking summaries + tool indicators)
|
||||
let contentEl = null; // .answer-content (final streaming answer)
|
||||
let mediaEl = null; // .media-content (images & file attachments)
|
||||
let accumulatedText = '';
|
||||
let currentToolEl = null;
|
||||
|
||||
@@ -734,6 +777,7 @@ function startSSE(requestId, loadingEl, timestamp) {
|
||||
<div class="bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-2xl px-4 py-3 text-sm leading-relaxed msg-content text-slate-700 dark:text-slate-200">
|
||||
<div class="agent-steps"></div>
|
||||
<div class="answer-content sse-streaming"></div>
|
||||
<div class="media-content"></div>
|
||||
</div>
|
||||
<div class="text-xs text-slate-400 dark:text-slate-500 mt-1.5">${formatTime(timestamp)}</div>
|
||||
</div>
|
||||
@@ -741,6 +785,7 @@ function startSSE(requestId, loadingEl, timestamp) {
|
||||
messagesDiv.appendChild(botEl);
|
||||
stepsEl = botEl.querySelector('.agent-steps');
|
||||
contentEl = botEl.querySelector('.answer-content');
|
||||
mediaEl = botEl.querySelector('.media-content');
|
||||
}
|
||||
|
||||
es.onmessage = function(e) {
|
||||
@@ -831,6 +876,38 @@ function startSSE(requestId, loadingEl, timestamp) {
|
||||
currentToolEl = null;
|
||||
}
|
||||
|
||||
} else if (item.type === 'image') {
|
||||
ensureBotEl();
|
||||
const imgEl = document.createElement('img');
|
||||
imgEl.src = item.content;
|
||||
imgEl.alt = 'screenshot';
|
||||
imgEl.style.cssText = 'max-width:600px;border-radius:8px;margin:8px 0;cursor:pointer;box-shadow:0 1px 4px rgba(0,0,0,0.1);';
|
||||
imgEl.onclick = () => window.open(item.content, '_blank');
|
||||
mediaEl.appendChild(imgEl);
|
||||
scrollChatToBottom();
|
||||
|
||||
} else if (item.type === 'file') {
|
||||
ensureBotEl();
|
||||
const fileName = item.file_name || item.content.split('/').pop();
|
||||
const fileEl = document.createElement('a');
|
||||
fileEl.href = item.content;
|
||||
fileEl.download = fileName;
|
||||
fileEl.target = '_blank';
|
||||
fileEl.className = 'file-attachment';
|
||||
fileEl.style.cssText = 'display:inline-flex;align-items:center;gap:6px;padding:8px 14px;margin:8px 0;border-radius:8px;background:var(--bg-secondary,#f3f4f6);color:var(--text-primary,#374151);text-decoration:none;font-size:14px;border:1px solid var(--border-color,#e5e7eb);';
|
||||
fileEl.innerHTML = `<i class="fas fa-file-download" style="color:#6b7280;"></i> ${fileName}`;
|
||||
mediaEl.appendChild(fileEl);
|
||||
scrollChatToBottom();
|
||||
|
||||
} else if (item.type === 'phase') {
|
||||
// Coarse progress (e.g. cow install-browser); must not close SSE (unlike "done")
|
||||
ensureBotEl();
|
||||
const wrap = document.createElement('div');
|
||||
wrap.className = 'text-xs sm:text-sm text-slate-600 dark:text-slate-400 border-l-2 border-primary-400 pl-2 py-1 my-0.5';
|
||||
wrap.textContent = String(item.content || '');
|
||||
stepsEl.appendChild(wrap);
|
||||
scrollChatToBottom();
|
||||
|
||||
} else if (item.type === 'done') {
|
||||
es.close();
|
||||
delete activeStreams[requestId];
|
||||
@@ -1615,7 +1692,7 @@ function renderSkillCard(card, sk) {
|
||||
</div>
|
||||
<div class="flex-1 min-w-0">
|
||||
<div class="flex items-center gap-2 mb-1">
|
||||
<span class="font-medium text-sm text-slate-700 dark:text-slate-200 truncate flex-1">${escapeHtml(sk.name)}</span>
|
||||
<span class="font-medium text-sm text-slate-700 dark:text-slate-200 truncate flex-1">${escapeHtml(sk.display_name || sk.name)}</span>
|
||||
<button
|
||||
role="switch"
|
||||
aria-checked="${enabled}"
|
||||
@@ -1810,19 +1887,23 @@ function renderActiveChannels() {
|
||||
const hasFields = (ch.fields || []).length > 0;
|
||||
|
||||
const weixinWaiting = ch.name === 'weixin' && ch.login_status && ch.login_status !== 'logged_in';
|
||||
const wecomNeedsCreds = ch.name === 'wecom_bot' && !_wecomBotHasCreds(ch);
|
||||
let statusDot, statusText;
|
||||
if (weixinWaiting) {
|
||||
statusDot = 'bg-amber-400 animate-pulse';
|
||||
statusText = ch.login_status === 'scanned'
|
||||
? `<span class="text-xs text-primary-500">${t('weixin_scan_scanned')}</span>`
|
||||
: `<span class="text-xs text-amber-500">${t('weixin_scan_waiting')}</span>`;
|
||||
} else if (wecomNeedsCreds) {
|
||||
statusDot = 'bg-amber-400 animate-pulse';
|
||||
statusText = `<span class="text-xs text-amber-500">${t('channels_connecting')}</span>`;
|
||||
} else {
|
||||
statusDot = 'bg-primary-400';
|
||||
statusText = `<span class="text-xs text-primary-500">${t('channels_connected')}</span>`;
|
||||
}
|
||||
|
||||
card.innerHTML = `
|
||||
<div class="flex items-center gap-4${hasFields || weixinWaiting ? ' mb-5' : ''}">
|
||||
<div class="flex items-center gap-4${hasFields || weixinWaiting || wecomNeedsCreds ? ' mb-5' : ''}">
|
||||
<div class="w-10 h-10 rounded-xl bg-${ch.color}-50 dark:bg-${ch.color}-900/20 flex items-center justify-center flex-shrink-0">
|
||||
<i class="fas ${ch.icon} text-${ch.color}-500 text-base"></i>
|
||||
</div>
|
||||
@@ -1849,6 +1930,15 @@ function renderActiveChannels() {
|
||||
${t('weixin_scan_title')}
|
||||
</button>
|
||||
</div>` : ''}
|
||||
${wecomNeedsCreds ? `<div id="wecom-active-auth" class="flex flex-col items-center py-2">
|
||||
<p class="text-sm text-slate-500 dark:text-slate-400 mb-3">${t('wecom_scan_desc')}</p>
|
||||
<button onclick="startWecomBotAuthInCard()"
|
||||
class="px-5 py-2 rounded-lg bg-emerald-500 hover:bg-emerald-600 text-white text-sm font-medium
|
||||
cursor-pointer transition-colors duration-150">
|
||||
<i class="fas fa-qrcode mr-2"></i>${t('wecom_scan_btn')}
|
||||
</button>
|
||||
<div id="wecom-card-scan-status" class="mt-3"></div>
|
||||
</div>` : ''}
|
||||
${hasFields ? `<div class="space-y-4">
|
||||
${fieldsHtml}
|
||||
<div class="flex items-center justify-end gap-3 pt-1">
|
||||
@@ -2085,6 +2175,13 @@ function onAddChannelSelect(chName) {
|
||||
return;
|
||||
}
|
||||
|
||||
if (chName === 'wecom_bot') {
|
||||
actions.classList.add('hidden');
|
||||
const ch = channelsData.find(c => c.name === chName);
|
||||
fieldsContainer.innerHTML = buildWecomBotPanel(ch);
|
||||
return;
|
||||
}
|
||||
|
||||
const ch = channelsData.find(c => c.name === chName);
|
||||
if (!ch) return;
|
||||
|
||||
@@ -2306,6 +2403,191 @@ function connectWeixinAfterQr() {
|
||||
.catch(() => {});
|
||||
}
|
||||
|
||||
// =====================================================================
|
||||
// WeCom Bot QR Auth
|
||||
// =====================================================================
|
||||
const WECOM_BOT_SDK_URL = 'https://wwcdn.weixin.qq.com/node/wework/js/wecom-aibot-sdk@0.1.0.min.js';
|
||||
const WECOM_BOT_SOURCE = 'cowagent';
|
||||
let _wecomSdkLoaded = false;
|
||||
|
||||
function ensureWecomSdkLoaded() {
|
||||
return new Promise((resolve, reject) => {
|
||||
if (_wecomSdkLoaded && window.WecomAIBotSDK) { resolve(); return; }
|
||||
if (document.querySelector(`script[src="${WECOM_BOT_SDK_URL}"]`)) {
|
||||
_wecomSdkLoaded = true; resolve(); return;
|
||||
}
|
||||
const s = document.createElement('script');
|
||||
s.src = WECOM_BOT_SDK_URL;
|
||||
s.onload = () => { _wecomSdkLoaded = true; resolve(); };
|
||||
s.onerror = () => reject(new Error('Failed to load WecomAIBotSDK'));
|
||||
document.head.appendChild(s);
|
||||
});
|
||||
}
|
||||
|
||||
function _wecomBotHasCreds(ch) {
|
||||
if (!ch || !ch.fields) return false;
|
||||
const idField = ch.fields.find(f => f.key === 'wecom_bot_id');
|
||||
const secretField = ch.fields.find(f => f.key === 'wecom_bot_secret');
|
||||
return !!(idField && idField.value && secretField && secretField.value);
|
||||
}
|
||||
|
||||
function buildWecomBotPanel(ch) {
|
||||
const scanLabel = t('wecom_mode_scan');
|
||||
const manualLabel = t('wecom_mode_manual');
|
||||
const hasCreds = _wecomBotHasCreds(ch);
|
||||
const defaultMode = hasCreds ? 'manual' : 'scan';
|
||||
return `
|
||||
<div id="wecom-bot-panel" data-default-mode="${defaultMode}">
|
||||
<div class="flex items-center justify-center gap-1 mb-5 bg-slate-100 dark:bg-white/5 rounded-lg p-1">
|
||||
<button id="wecom-tab-scan" onclick="switchWecomBotMode('scan')"
|
||||
class="flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors
|
||||
bg-white dark:bg-slate-700 text-slate-800 dark:text-slate-100 shadow-sm">
|
||||
${scanLabel}
|
||||
</button>
|
||||
<button id="wecom-tab-manual" onclick="switchWecomBotMode('manual')"
|
||||
class="flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors
|
||||
text-slate-500 dark:text-slate-400 hover:text-slate-700 dark:hover:text-slate-200">
|
||||
${manualLabel}
|
||||
</button>
|
||||
</div>
|
||||
<div id="wecom-mode-content"></div>
|
||||
</div>`;
|
||||
}
|
||||
|
||||
function switchWecomBotMode(mode) {
|
||||
const scanTab = document.getElementById('wecom-tab-scan');
|
||||
const manualTab = document.getElementById('wecom-tab-manual');
|
||||
const content = document.getElementById('wecom-mode-content');
|
||||
const actions = document.getElementById('add-channel-actions');
|
||||
if (!scanTab || !manualTab || !content) return;
|
||||
|
||||
const activeClasses = 'bg-white dark:bg-slate-700 text-slate-800 dark:text-slate-100 shadow-sm';
|
||||
const inactiveClasses = 'text-slate-500 dark:text-slate-400 hover:text-slate-700 dark:hover:text-slate-200';
|
||||
|
||||
if (mode === 'scan') {
|
||||
scanTab.className = scanTab.className.replace(/text-slate-500[^\s]*/g, '').replace(/hover:\S+/g, '');
|
||||
scanTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${activeClasses}`;
|
||||
manualTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${inactiveClasses}`;
|
||||
actions.classList.add('hidden');
|
||||
content.innerHTML = `
|
||||
<div class="flex flex-col items-center py-4">
|
||||
<p class="text-sm text-slate-600 dark:text-slate-300 mb-2">${t('wecom_scan_desc')}</p>
|
||||
<button onclick="startWecomBotAuth()"
|
||||
class="mt-3 px-6 py-2.5 rounded-lg bg-emerald-500 hover:bg-emerald-600 text-white text-sm font-medium
|
||||
cursor-pointer transition-colors duration-150">
|
||||
<i class="fas fa-qrcode mr-2"></i>${t('wecom_scan_btn')}
|
||||
</button>
|
||||
<div id="wecom-scan-status" class="mt-3"></div>
|
||||
</div>`;
|
||||
} else {
|
||||
manualTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${activeClasses}`;
|
||||
scanTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${inactiveClasses}`;
|
||||
const ch = channelsData.find(c => c.name === 'wecom_bot');
|
||||
content.innerHTML = `<div class="space-y-4">${buildChannelFieldsHtml('wecom_bot', ch ? ch.fields || [] : [])}</div>`;
|
||||
bindSecretFieldEvents(content);
|
||||
actions.classList.remove('hidden');
|
||||
}
|
||||
}
|
||||
|
||||
function startWecomBotAuth() {
|
||||
const statusEl = document.getElementById('wecom-scan-status');
|
||||
ensureWecomSdkLoaded().then(() => {
|
||||
WecomAIBotSDK.openBotInfoAuthWindow({
|
||||
source: WECOM_BOT_SOURCE,
|
||||
onCreated: function(bot) {
|
||||
if (statusEl) {
|
||||
statusEl.innerHTML = `
|
||||
<div class="flex flex-col items-center py-2">
|
||||
<div class="w-10 h-10 rounded-full bg-emerald-50 dark:bg-emerald-900/30 flex items-center justify-center mb-2">
|
||||
<i class="fas fa-check text-emerald-500 text-lg"></i>
|
||||
</div>
|
||||
<p class="text-sm font-medium text-emerald-600 dark:text-emerald-400">${t('wecom_scan_success')}</p>
|
||||
</div>`;
|
||||
}
|
||||
connectWecomBotAfterAuth(bot.botid, bot.secret);
|
||||
},
|
||||
onError: function(err) {
|
||||
if (statusEl) {
|
||||
statusEl.innerHTML = `<p class="text-sm text-red-500">${t('wecom_scan_fail')}: ${err.message || err.code || ''}</p>`;
|
||||
}
|
||||
}
|
||||
});
|
||||
}).catch(err => {
|
||||
if (statusEl) {
|
||||
statusEl.innerHTML = `<p class="text-sm text-red-500">SDK load failed: ${err.message}</p>`;
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
function connectWecomBotAfterAuth(botId, secret) {
|
||||
fetch('/api/channels', {
|
||||
method: 'POST',
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify({
|
||||
action: 'connect',
|
||||
channel: 'wecom_bot',
|
||||
config: { wecom_bot_id: botId, wecom_bot_secret: secret }
|
||||
})
|
||||
})
|
||||
.then(r => r.json())
|
||||
.then(data => {
|
||||
if (data.status === 'success') {
|
||||
const ch = channelsData.find(c => c.name === 'wecom_bot');
|
||||
if (ch) {
|
||||
ch.active = true;
|
||||
(ch.fields || []).forEach(f => {
|
||||
if (f.key === 'wecom_bot_id') f.value = botId;
|
||||
if (f.key === 'wecom_bot_secret') f.value = ChannelsHandler_maskSecret(secret);
|
||||
});
|
||||
}
|
||||
setTimeout(() => renderActiveChannels(), 1500);
|
||||
}
|
||||
})
|
||||
.catch(() => {});
|
||||
}
|
||||
|
||||
function startWecomBotAuthInCard() {
|
||||
const statusEl = document.getElementById('wecom-card-scan-status');
|
||||
ensureWecomSdkLoaded().then(() => {
|
||||
WecomAIBotSDK.openBotInfoAuthWindow({
|
||||
source: WECOM_BOT_SOURCE,
|
||||
onCreated: function(bot) {
|
||||
if (statusEl) {
|
||||
statusEl.innerHTML = `
|
||||
<div class="flex flex-col items-center py-2">
|
||||
<div class="w-10 h-10 rounded-full bg-emerald-50 dark:bg-emerald-900/30 flex items-center justify-center mb-2">
|
||||
<i class="fas fa-check text-emerald-500 text-lg"></i>
|
||||
</div>
|
||||
<p class="text-sm font-medium text-emerald-600 dark:text-emerald-400">${t('wecom_scan_success')}</p>
|
||||
</div>`;
|
||||
}
|
||||
connectWecomBotAfterAuth(bot.botid, bot.secret);
|
||||
},
|
||||
onError: function(err) {
|
||||
if (statusEl) {
|
||||
statusEl.innerHTML = `<p class="text-sm text-red-500">${t('wecom_scan_fail')}: ${err.message || err.code || ''}</p>`;
|
||||
}
|
||||
}
|
||||
});
|
||||
}).catch(err => {
|
||||
if (statusEl) {
|
||||
statusEl.innerHTML = `<p class="text-sm text-red-500">SDK load failed: ${err.message}</p>`;
|
||||
}
|
||||
});
|
||||
}
|
||||
|
||||
// Initialize wecom bot panel with correct default mode when inserted into DOM
|
||||
document.addEventListener('DOMContentLoaded', function() {
|
||||
const observer = new MutationObserver(function() {
|
||||
const panel = document.getElementById('wecom-bot-panel');
|
||||
if (panel && !panel.dataset.initialized) {
|
||||
panel.dataset.initialized = '1';
|
||||
switchWecomBotMode(panel.dataset.defaultMode || 'scan');
|
||||
}
|
||||
});
|
||||
observer.observe(document.body, { childList: true, subtree: true });
|
||||
});
|
||||
|
||||
// =====================================================================
|
||||
// Scheduler View
|
||||
// =====================================================================
|
||||
|
||||
@@ -96,9 +96,36 @@ class WebChannel(ChatChannel):
|
||||
logger.error(f"No session_id found for request {request_id}")
|
||||
return
|
||||
|
||||
# SSE mode: push done event to SSE queue
|
||||
# SSE mode: push events to SSE queue
|
||||
if request_id in self.sse_queues:
|
||||
content = reply.content if reply.content is not None else ""
|
||||
|
||||
# Intermediate status lines (e.g. /install-browser phases) must NOT use "done",
|
||||
# or the frontend closes EventSource and drops subsequent events.
|
||||
if getattr(reply, "sse_phase", False):
|
||||
self.sse_queues[request_id].put({
|
||||
"type": "phase",
|
||||
"content": content,
|
||||
"request_id": request_id,
|
||||
"timestamp": time.time(),
|
||||
})
|
||||
logger.debug(f"SSE phase for request {request_id}")
|
||||
return
|
||||
|
||||
# Files are already pushed via on_event (file_to_send) during agent execution.
|
||||
# Skip duplicate file pushes here; just let the done event through.
|
||||
if reply.type in (ReplyType.IMAGE_URL, ReplyType.FILE) and content.startswith("file://"):
|
||||
text_content = getattr(reply, 'text_content', '')
|
||||
if text_content:
|
||||
self.sse_queues[request_id].put({
|
||||
"type": "done",
|
||||
"content": text_content,
|
||||
"request_id": request_id,
|
||||
"timestamp": time.time()
|
||||
})
|
||||
logger.debug(f"SSE skipped duplicate file for request {request_id}")
|
||||
return
|
||||
|
||||
self.sse_queues[request_id].put({
|
||||
"type": "done",
|
||||
"content": content,
|
||||
@@ -161,6 +188,19 @@ class WebChannel(ChatChannel):
|
||||
"execution_time": round(exec_time, 2)
|
||||
})
|
||||
|
||||
elif event_type == "file_to_send":
|
||||
file_path = data.get("path", "")
|
||||
file_name = data.get("file_name", os.path.basename(file_path))
|
||||
file_type = data.get("file_type", "file")
|
||||
from urllib.parse import quote
|
||||
web_url = f"/api/file?path={quote(file_path)}"
|
||||
is_image = file_type == "image"
|
||||
q.put({
|
||||
"type": "image" if is_image else "file",
|
||||
"content": web_url,
|
||||
"file_name": file_name,
|
||||
})
|
||||
|
||||
return on_event
|
||||
|
||||
def upload_file(self):
|
||||
@@ -377,6 +417,7 @@ class WebChannel(ChatChannel):
|
||||
'/message', 'MessageHandler',
|
||||
'/upload', 'UploadHandler',
|
||||
'/uploads/(.*)', 'UploadsHandler',
|
||||
'/api/file', 'FileServeHandler',
|
||||
'/poll', 'PollHandler',
|
||||
'/stream', 'StreamHandler',
|
||||
'/chat', 'ChatHandler',
|
||||
@@ -463,6 +504,32 @@ class UploadsHandler:
|
||||
raise web.notfound()
|
||||
|
||||
|
||||
class FileServeHandler:
|
||||
def GET(self):
|
||||
"""Serve a local file by absolute path (for agent send tool)."""
|
||||
try:
|
||||
params = web.input(path="")
|
||||
file_path = params.path
|
||||
if not file_path or not os.path.isabs(file_path):
|
||||
raise web.notfound()
|
||||
file_path = os.path.normpath(file_path)
|
||||
if not os.path.isfile(file_path):
|
||||
raise web.notfound()
|
||||
content_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
|
||||
file_name = os.path.basename(file_path)
|
||||
from urllib.parse import quote
|
||||
web.header('Content-Type', content_type)
|
||||
web.header('Content-Disposition', f"inline; filename*=UTF-8''{quote(file_name)}")
|
||||
web.header('Cache-Control', 'public, max-age=3600')
|
||||
with open(file_path, 'rb') as f:
|
||||
return f.read()
|
||||
except web.HTTPError:
|
||||
raise
|
||||
except Exception as e:
|
||||
logger.error(f"[WebChannel] Error serving file: {e}")
|
||||
raise web.notfound()
|
||||
|
||||
|
||||
class PollHandler:
|
||||
def POST(self):
|
||||
return WebChannel().poll_response()
|
||||
@@ -569,6 +636,13 @@ class ConfigHandler:
|
||||
"api_base_default": "https://api.deepseek.com/v1",
|
||||
"models": [const.DEEPSEEK_CHAT, const.DEEPSEEK_REASONER],
|
||||
}),
|
||||
("modelscope", {
|
||||
"label": "ModelScope",
|
||||
"api_key_field": "modelscope_api_key",
|
||||
"api_base_key": None,
|
||||
"api_base_default": None,
|
||||
"models": [const.QWEN3_5_27B, const.QWEN3_235B_A22B_INSTRUCT_2507],
|
||||
}),
|
||||
("linkai", {
|
||||
"label": "LinkAI",
|
||||
"api_key_field": "linkai_api_key",
|
||||
@@ -1291,6 +1365,8 @@ class MemoryContentHandler:
|
||||
service = MemoryService(workspace_root)
|
||||
result = service.get_content(params.filename)
|
||||
return json.dumps({"status": "success", **result}, ensure_ascii=False)
|
||||
except ValueError:
|
||||
return json.dumps({"status": "error", "message": "invalid filename"})
|
||||
except FileNotFoundError:
|
||||
return json.dumps({"status": "error", "message": "file not found"})
|
||||
except Exception as e:
|
||||
|
||||
@@ -330,28 +330,42 @@ class WecomBotChannel(ChatChannel):
|
||||
|
||||
All intermediate segments (thinking before tool calls) and the final answer
|
||||
are accumulated into a single stream message, separated by '---'.
|
||||
Throttles push to at most once per 100ms to avoid WebSocket congestion.
|
||||
"""
|
||||
stream_id = uuid.uuid4().hex[:16]
|
||||
self._stream_states[req_id] = {
|
||||
"stream_id": stream_id,
|
||||
"committed": "", # finalized content from previous segments
|
||||
"current": "", # current segment being streamed
|
||||
"committed": "",
|
||||
"current": "",
|
||||
"last_push_time": 0,
|
||||
"last_push_len": 0,
|
||||
}
|
||||
|
||||
def _push_stream(state: dict):
|
||||
"""Push current stream content to wecom."""
|
||||
self._ws_send({
|
||||
"cmd": "aibot_respond_msg",
|
||||
"headers": {"req_id": req_id},
|
||||
"body": {
|
||||
"msgtype": "stream",
|
||||
"stream": {
|
||||
"id": state["stream_id"],
|
||||
"finish": False,
|
||||
"content": state["committed"] + state["current"],
|
||||
def _push_stream(state: dict, force: bool = False):
|
||||
"""Push current stream content to wecom (throttled unless forced)."""
|
||||
now = time.time()
|
||||
if not force and now - state["last_push_time"] < 0.1:
|
||||
return
|
||||
content = state["committed"] + state["current"]
|
||||
if len(content) == state["last_push_len"]:
|
||||
return
|
||||
state["last_push_time"] = now
|
||||
state["last_push_len"] = len(content)
|
||||
try:
|
||||
self._ws_send({
|
||||
"cmd": "aibot_respond_msg",
|
||||
"headers": {"req_id": req_id},
|
||||
"body": {
|
||||
"msgtype": "stream",
|
||||
"stream": {
|
||||
"id": state["stream_id"],
|
||||
"finish": False,
|
||||
"content": content,
|
||||
},
|
||||
},
|
||||
},
|
||||
})
|
||||
})
|
||||
except Exception as e:
|
||||
logger.warning(f"[WecomBot] Stream push failed: {e}")
|
||||
|
||||
def on_event(event: dict):
|
||||
event_type = event.get("type")
|
||||
@@ -378,6 +392,7 @@ class WecomBotChannel(ChatChannel):
|
||||
else:
|
||||
state["committed"] += state["current"]
|
||||
state["current"] = ""
|
||||
_push_stream(state, force=True)
|
||||
|
||||
return on_event
|
||||
|
||||
@@ -452,11 +467,16 @@ class WecomBotChannel(ChatChannel):
|
||||
if req_id:
|
||||
state = self._stream_states.pop(req_id, None)
|
||||
if state:
|
||||
final_content = state["committed"]
|
||||
final_content = state["committed"] if state["committed"] else content
|
||||
stream_id = state["stream_id"]
|
||||
else:
|
||||
final_content = content
|
||||
stream_id = uuid.uuid4().hex[:16]
|
||||
|
||||
# Brief pause so the server finishes processing the last intermediate chunk
|
||||
# before receiving the finish packet
|
||||
time.sleep(0.15)
|
||||
|
||||
self._ws_send({
|
||||
"cmd": "aibot_respond_msg",
|
||||
"headers": {"req_id": req_id},
|
||||
|
||||
@@ -303,13 +303,18 @@ def upload_media_to_cdn(api: WeixinApi, file_path: str, to_user_id: str,
|
||||
filesize=cipher_size,
|
||||
aeskey=aes_key_hex,
|
||||
)
|
||||
upload_param = resp.get("upload_param", "")
|
||||
if not upload_param:
|
||||
raise RuntimeError(f"[Weixin] getUploadUrl returned no upload_param: {resp}")
|
||||
|
||||
cdn_url = (f"{api.cdn_base_url}/upload"
|
||||
f"?encrypted_query_param={quote(upload_param)}"
|
||||
f"&filekey={quote(filekey)}")
|
||||
# API may return either upload_full_url (new) or upload_param (legacy)
|
||||
upload_full_url = resp.get("upload_full_url", "")
|
||||
upload_param = resp.get("upload_param", "")
|
||||
if upload_full_url:
|
||||
cdn_url = upload_full_url
|
||||
elif upload_param:
|
||||
cdn_url = (f"{api.cdn_base_url}/upload"
|
||||
f"?encrypted_query_param={quote(upload_param)}"
|
||||
f"&filekey={quote(filekey)}")
|
||||
else:
|
||||
raise RuntimeError(f"[Weixin] getUploadUrl returned neither upload_full_url nor upload_param: {resp}")
|
||||
|
||||
cdn_resp = requests.post(cdn_url, data=encrypted, headers={
|
||||
"Content-Type": "application/octet-stream",
|
||||
|
||||
@@ -1 +1 @@
|
||||
2.0.4
|
||||
2.0.5
|
||||
|
||||
@@ -5,6 +5,7 @@ from cli import __version__
|
||||
from cli.commands.skill import skill
|
||||
from cli.commands.process import start, stop, restart, update, status, logs
|
||||
from cli.commands.context import context
|
||||
from cli.commands.install import install_browser
|
||||
|
||||
|
||||
HELP_TEXT = """Usage: cow COMMAND [ARGS]...
|
||||
@@ -21,6 +22,7 @@ Commands:
|
||||
status Show CowAgent running status.
|
||||
logs View CowAgent logs.
|
||||
skill Manage CowAgent skills.
|
||||
install-browser Install browser tool (Playwright + Chromium).
|
||||
|
||||
Tip: You can also send /help, /skill list, etc. in agent chat."""
|
||||
|
||||
@@ -67,6 +69,7 @@ main.add_command(update)
|
||||
main.add_command(status)
|
||||
main.add_command(logs)
|
||||
main.add_command(context)
|
||||
main.add_command(install_browser)
|
||||
|
||||
|
||||
if __name__ == '__main__':
|
||||
|
||||
259
cli/commands/install.py
Normal file
259
cli/commands/install.py
Normal file
@@ -0,0 +1,259 @@
|
||||
"""cow install-browser - Install Playwright + Chromium for the browser tool."""
|
||||
|
||||
import os
|
||||
import sys
|
||||
import subprocess
|
||||
from typing import Callable, Optional
|
||||
|
||||
import click
|
||||
|
||||
PLAYWRIGHT_VERSION = "1.52.0"
|
||||
PLAYWRIGHT_LEGACY_VERSION = "1.28.0"
|
||||
GLIBC_THRESHOLD = (2, 28)
|
||||
CHINA_MIRROR = "https://registry.npmmirror.com/-/binary/playwright"
|
||||
|
||||
# stream(msg, fg=None) — fg is "yellow" | "green" | "red" | None
|
||||
StreamFn = Callable[[str, Optional[str]], None]
|
||||
# on_phase(msg) — coarse-grained progress for chat channels (Chinese)
|
||||
PhaseFn = Callable[[str], None]
|
||||
|
||||
|
||||
def _phase(cb: Optional[PhaseFn], msg: str) -> None:
|
||||
if cb:
|
||||
cb(msg)
|
||||
|
||||
|
||||
def _has_display() -> bool:
|
||||
"""Check if a graphical display is available (Linux only)."""
|
||||
return bool(os.environ.get("DISPLAY") or os.environ.get("WAYLAND_DISPLAY"))
|
||||
|
||||
|
||||
def _is_headless_linux() -> bool:
|
||||
return sys.platform == "linux" and not _has_display()
|
||||
|
||||
|
||||
def _get_installed_version() -> str:
|
||||
try:
|
||||
out = subprocess.check_output(
|
||||
[sys.executable, "-c", "import playwright; print(playwright.__version__)"],
|
||||
stderr=subprocess.DEVNULL,
|
||||
)
|
||||
return out.decode().strip()
|
||||
except Exception:
|
||||
return ""
|
||||
|
||||
|
||||
def _version_tuple(v: str):
|
||||
try:
|
||||
return tuple(int(x) for x in v.split(".")[:3])
|
||||
except (ValueError, AttributeError):
|
||||
return (0, 0, 0)
|
||||
|
||||
|
||||
def _get_glibc_version():
|
||||
if sys.platform != "linux":
|
||||
return None
|
||||
try:
|
||||
import ctypes
|
||||
libc = ctypes.CDLL("libc.so.6")
|
||||
gnu_get_libc_version = libc.gnu_get_libc_version
|
||||
gnu_get_libc_version.restype = ctypes.c_char_p
|
||||
ver = gnu_get_libc_version().decode()
|
||||
parts = ver.split(".")
|
||||
return (int(parts[0]), int(parts[1]))
|
||||
except Exception:
|
||||
return None
|
||||
|
||||
|
||||
def _is_china_network() -> bool:
|
||||
try:
|
||||
out = subprocess.check_output(
|
||||
[sys.executable, "-m", "pip", "config", "get", "global.index-url"],
|
||||
stderr=subprocess.DEVNULL,
|
||||
)
|
||||
url = out.decode().strip().lower()
|
||||
return any(kw in url for kw in ("tsinghua", "aliyun", "npmmirror", "douban", "ustc", "huawei", "tencentyun"))
|
||||
except Exception:
|
||||
return False
|
||||
|
||||
|
||||
def _pip_install(package_spec: str, stream: StreamFn) -> int:
|
||||
"""Install a package, retrying with --user on permission failure."""
|
||||
python = sys.executable
|
||||
ret = subprocess.call([python, "-m", "pip", "install", package_spec])
|
||||
if ret != 0:
|
||||
stream(" Retrying with --user flag...", "yellow")
|
||||
ret = subprocess.call([python, "-m", "pip", "install", "--user", package_spec])
|
||||
return ret
|
||||
|
||||
|
||||
def _default_stream(msg: str, fg: Optional[str] = None) -> None:
|
||||
"""CLI: colored click output."""
|
||||
if fg == "yellow":
|
||||
click.echo(click.style(msg, fg="yellow"))
|
||||
elif fg == "green":
|
||||
click.echo(click.style(msg, fg="green"))
|
||||
elif fg == "red":
|
||||
click.echo(click.style(msg, fg="red"))
|
||||
else:
|
||||
click.echo(msg)
|
||||
|
||||
|
||||
def run_install_browser(
|
||||
stream: Optional[StreamFn] = None,
|
||||
on_phase: Optional[PhaseFn] = None,
|
||||
) -> int:
|
||||
"""
|
||||
Install Playwright Python package, optional Linux deps, and Chromium.
|
||||
|
||||
Reused by ``cow install-browser`` CLI and chat ``/install-browser``.
|
||||
|
||||
Args:
|
||||
stream: Optional callback ``(message, fg)`` for each line. ``fg`` is
|
||||
``yellow`` / ``green`` / ``red`` or None. Defaults to colored click output.
|
||||
on_phase: Optional callback for coarse progress (e.g. push to chat);
|
||||
messages are short Chinese status lines.
|
||||
|
||||
Returns:
|
||||
0 on success, 1 on fatal failure (pip or chromium install failed).
|
||||
"""
|
||||
stream = stream or _default_stream
|
||||
python = sys.executable
|
||||
legacy_mode = False
|
||||
|
||||
_phase(on_phase, "🔧 开始安装浏览器工具依赖(约几分钟,请耐心等待)…")
|
||||
|
||||
glibc = _get_glibc_version()
|
||||
if glibc and glibc < GLIBC_THRESHOLD:
|
||||
legacy_mode = True
|
||||
glibc_str = f"{glibc[0]}.{glibc[1]}"
|
||||
stream(
|
||||
f"glibc {glibc_str} detected (< 2.28). "
|
||||
f"Will install playwright {PLAYWRIGHT_LEGACY_VERSION} for compatibility.",
|
||||
"yellow",
|
||||
)
|
||||
stream(" Note: upgrade your OS for full browser tool support.", "yellow")
|
||||
stream("")
|
||||
_phase(
|
||||
on_phase,
|
||||
f"ℹ️ 检测到 glibc {glibc_str}(较旧),将安装兼容版 Playwright {PLAYWRIGHT_LEGACY_VERSION}。",
|
||||
)
|
||||
|
||||
target_version = PLAYWRIGHT_LEGACY_VERSION if legacy_mode else PLAYWRIGHT_VERSION
|
||||
|
||||
_phase(on_phase, "📦 [1/3] 正在安装 Playwright Python 包…")
|
||||
stream("[1/3] Installing playwright Python package...", "yellow")
|
||||
ret = _pip_install(f"playwright=={target_version}", stream)
|
||||
if ret != 0:
|
||||
stream("Failed to install playwright package.", "red")
|
||||
_phase(on_phase, "❌ [1/3] Playwright Python 包安装失败。")
|
||||
return 1
|
||||
|
||||
installed = _get_installed_version()
|
||||
if installed:
|
||||
stream(f" playwright {installed} installed.", "green")
|
||||
stream("")
|
||||
_phase(on_phase, f"✅ [1/3] Playwright 包已安装({installed or target_version})。")
|
||||
|
||||
if sys.platform == "linux":
|
||||
_phase(on_phase, "🔧 [2/3] 正在安装 Linux 系统依赖与轻量中文字体(文泉驿正黑,部分步骤可能需要 sudo)…")
|
||||
stream("[2/3] Installing system dependencies (Linux)...", "yellow")
|
||||
ret = subprocess.call([python, "-m", "playwright", "install-deps", "chromium"])
|
||||
if ret != 0:
|
||||
stream(
|
||||
" Could not auto-install system deps (may need sudo).\n"
|
||||
f" Run manually: sudo {python} -m playwright install-deps chromium",
|
||||
"yellow",
|
||||
)
|
||||
# Prefer fonts-wqy-zenhei only (~few MB). fonts-noto-cjk is much larger (~150MB+).
|
||||
stream(" Installing CJK font (fonts-wqy-zenhei, lightweight)...")
|
||||
font_ret = subprocess.call(
|
||||
["sudo", "apt-get", "install", "-y", "--no-install-recommends", "fonts-wqy-zenhei"],
|
||||
stderr=subprocess.DEVNULL,
|
||||
)
|
||||
if font_ret != 0:
|
||||
stream(
|
||||
" Could not auto-install CJK font.\n"
|
||||
" Run manually: sudo apt-get install -y fonts-wqy-zenhei\n"
|
||||
" (Optional, larger full coverage: sudo apt-get install -y fonts-noto-cjk)",
|
||||
"yellow",
|
||||
)
|
||||
else:
|
||||
subprocess.call(["fc-cache", "-fv"], stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
|
||||
stream(" CJK font (wqy-zenhei) installed.", "green")
|
||||
_phase(
|
||||
on_phase,
|
||||
"✅ [2/3] Linux 依赖与字体步骤已执行(若有权限问题请查看服务器日志或手动执行提示命令)。",
|
||||
)
|
||||
else:
|
||||
stream(f"[2/3] Skipping system deps (not needed on {sys.platform}).", "yellow")
|
||||
_phase(on_phase, f"ℹ️ [2/3] 当前系统({sys.platform})跳过 Linux 专用依赖。")
|
||||
stream("")
|
||||
|
||||
_phase(on_phase, "🌐 [3/3] 正在下载并安装 Chromium(体积较大,请耐心等待)…")
|
||||
stream("[3/3] Installing Chromium browser...", "yellow")
|
||||
cmd = [python, "-m", "playwright", "install", "chromium"]
|
||||
|
||||
if _is_headless_linux() and not legacy_mode:
|
||||
ver = _version_tuple(installed or "")
|
||||
if ver >= (1, 57, 0):
|
||||
cmd.append("--only-shell")
|
||||
stream(" (headless shell for Linux server)", None)
|
||||
else:
|
||||
stream(" (full Chromium)", None)
|
||||
elif sys.platform == "linux" and _has_display():
|
||||
stream(" (full browser for Linux desktop)", None)
|
||||
|
||||
env = os.environ.copy()
|
||||
use_mirror = _is_china_network()
|
||||
if use_mirror:
|
||||
env["PLAYWRIGHT_DOWNLOAD_HOST"] = CHINA_MIRROR
|
||||
stream(f" (using China mirror: {CHINA_MIRROR})", None)
|
||||
_phase(on_phase, "📡 检测到国内 pip 源配置,Chromium 将优先走国内镜像下载。")
|
||||
|
||||
ret = subprocess.call(cmd, env=env)
|
||||
|
||||
if ret != 0 and use_mirror:
|
||||
stream(" Mirror download failed, retrying with official CDN...", "yellow")
|
||||
_phase(on_phase, "⚠️ 镜像下载失败,正在改用官方源重试…")
|
||||
env_no_mirror = os.environ.copy()
|
||||
env_no_mirror.pop("PLAYWRIGHT_DOWNLOAD_HOST", None)
|
||||
ret = subprocess.call(cmd, env=env_no_mirror)
|
||||
|
||||
if ret != 0:
|
||||
stream("Failed to install Chromium.", "red")
|
||||
_phase(on_phase, "❌ [3/3] Chromium 安装失败。")
|
||||
return 1
|
||||
|
||||
stream("")
|
||||
_phase(on_phase, "✅ [3/3] Chromium 已安装。")
|
||||
|
||||
stream("Verifying browser installation...", None)
|
||||
_phase(on_phase, "🔍 正在验证 Playwright 能否正常加载…")
|
||||
ret = subprocess.call(
|
||||
[python, "-c", "from playwright.sync_api import sync_playwright; print('OK')"],
|
||||
stderr=subprocess.DEVNULL,
|
||||
)
|
||||
if ret != 0:
|
||||
stream(
|
||||
" Warning: playwright import failed. Browser tool may not work on this system.\n"
|
||||
" Consider upgrading your OS or using Docker.",
|
||||
"yellow",
|
||||
)
|
||||
_phase(on_phase, "⚠️ 验证未完全通过:本机可能仍无法使用浏览器工具,请查看日志或升级系统。")
|
||||
else:
|
||||
stream(" Verification passed.", "green")
|
||||
_phase(on_phase, "✅ 验证通过。")
|
||||
|
||||
stream("")
|
||||
stream("Browser tool ready! Restart CowAgent to enable it.", "green")
|
||||
_phase(on_phase, "🎉 全部步骤结束。请重启 CowAgent 后使用 browser 工具。")
|
||||
return 0
|
||||
|
||||
|
||||
@click.command("install-browser")
|
||||
def install_browser():
|
||||
"""Install browser tool dependencies (Playwright + Chromium)."""
|
||||
code = run_install_browser()
|
||||
if code != 0:
|
||||
raise SystemExit(code)
|
||||
@@ -206,10 +206,10 @@ def update(ctx):
|
||||
cwd=root,
|
||||
)
|
||||
|
||||
# 4. Start service
|
||||
# 4. Start service and follow logs
|
||||
click.echo("")
|
||||
time.sleep(1)
|
||||
ctx.invoke(start, no_logs=True)
|
||||
ctx.invoke(start, no_logs=False)
|
||||
|
||||
|
||||
@click.command()
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -487,6 +487,19 @@ class CloudClient(LinkAIClient):
|
||||
session_id = f"session_{session_id}"
|
||||
logger.info(f"[CloudClient] on_chat: session={session_id}, channel={channel_type}, query={query[:80]}")
|
||||
|
||||
# Intercept cow/slash commands before the agent runs
|
||||
try:
|
||||
from plugins import PluginManager
|
||||
mgr = PluginManager()
|
||||
instance = mgr.instances.get("COW_CLI")
|
||||
if instance and hasattr(instance, "execute"):
|
||||
result = instance.execute(query, session_id=session_id)
|
||||
if result is not None:
|
||||
send_chunk_fn({"chunk_type": "content", "delta": result, "segment_id": 0})
|
||||
return
|
||||
except Exception as e:
|
||||
logger.warning(f"[CloudClient] cow_cli intercept failed: {e}")
|
||||
|
||||
svc = self.chat_service
|
||||
if svc is None:
|
||||
raise RuntimeError("ChatService not available")
|
||||
@@ -629,9 +642,9 @@ def get_deployment_id() -> str:
|
||||
|
||||
|
||||
def get_website_base_url() -> str:
|
||||
"""Return the public URL prefix that maps to the workspace websites/ dir.
|
||||
"""Return the URL prefix that maps to the workspace websites/ dir.
|
||||
|
||||
Returns empty string when cloud deployment is not configured.
|
||||
Do nothing when in local env.
|
||||
"""
|
||||
deployment_id = get_deployment_id()
|
||||
if not deployment_id:
|
||||
@@ -648,6 +661,42 @@ def get_website_base_url() -> str:
|
||||
return f"https://app.{domain}/{deployment_id}"
|
||||
|
||||
|
||||
# Subdir under websites/ used by the send tool
|
||||
COW_SEND_WEB_SUBDIR = "cow-send"
|
||||
|
||||
|
||||
def copy_send_file(src_path: str, workspace_root: str) -> str:
|
||||
"""Copy *src_path* into ``websites/cow-send/`` and return its URL.
|
||||
|
||||
Returns empty string in local env.
|
||||
"""
|
||||
import shutil
|
||||
import uuid
|
||||
|
||||
from common.utils import expand_path
|
||||
|
||||
base = get_website_base_url()
|
||||
if not base or not src_path or not os.path.isfile(src_path):
|
||||
return ""
|
||||
ws = os.path.abspath(expand_path(workspace_root))
|
||||
send_dir = os.path.join(ws, "websites", COW_SEND_WEB_SUBDIR)
|
||||
try:
|
||||
os.makedirs(send_dir, exist_ok=True)
|
||||
except OSError:
|
||||
return ""
|
||||
ext = os.path.splitext(src_path)[1].lower()
|
||||
if len(ext) > 12 or not ext.replace(".", "").isalnum():
|
||||
ext = ""
|
||||
dest_name = f"{uuid.uuid4().hex}{ext}"
|
||||
dest_path = os.path.join(send_dir, dest_name)
|
||||
try:
|
||||
shutil.copy2(src_path, dest_path)
|
||||
except OSError as e:
|
||||
logger.warning(f"[cloud] copy_send_file: copy failed: {e}")
|
||||
return ""
|
||||
return f"{base}/{COW_SEND_WEB_SUBDIR}/{dest_name}"
|
||||
|
||||
|
||||
def build_website_prompt(workspace_dir: str) -> list:
|
||||
"""Build system prompt lines for cloud website/file sharing rules.
|
||||
|
||||
@@ -668,8 +717,8 @@ def build_website_prompt(workspace_dir: str) -> list:
|
||||
f" - 例如: `websites/my-app/index.html` → `{base_url}/my-app/index.html`",
|
||||
"",
|
||||
"2. **生成文件分享** (PPT、PDF、图片、音视频等): 当你为用户生成了需要下载或查看的文件时,**可以**将文件保存到 `websites/` 目录中",
|
||||
f" - 例如: 生成的PPT保存到 `websites/files/report.pptx` → 下载链接为 `{base_url}/files/report.pptx`",
|
||||
" - 你仍然可以同时使用 `send` 工具发送文件(在飞书、钉钉等IM渠道中有效),但**必须同时在回复文本中提供下载链接**作为兜底,因为部分渠道(如网页端)无法通过 send 接收本地文件",
|
||||
f" - 例如: 生成的PPT保存到 `websites/files/report.pptx` → 下载链接为 `{base_url}/files/report.pptx`",
|
||||
" - 你仍然可以同时使用 `send` 工具发送文件(在微信、飞书、钉钉、web等渠道中有效),但**必须同时在回复文本中提供下载链接**作为兜底,因为部分渠道无法通过 send 接收本地文件",
|
||||
"",
|
||||
"3. **必须发送链接**: 无论是网页还是文件,生成后**必须将完整的访问/下载链接直接写在回复文本中发送给用户**",
|
||||
"",
|
||||
|
||||
@@ -124,6 +124,10 @@ DOUBAO_SEED_2_PRO = "doubao-seed-2-0-pro-260215"
|
||||
DOUBAO_SEED_2_LITE = "doubao-seed-2-0-lite-260215"
|
||||
DOUBAO_SEED_2_MINI = "doubao-seed-2-0-mini-260215"
|
||||
|
||||
# ModelScope(魔搭社区)
|
||||
QWEN3_235B_A22B_INSTRUCT_2507 = "Qwen/Qwen3-235B-A22B-Instruct-2507"
|
||||
QWEN3_5_27B = "Qwen/Qwen3.5-27B"
|
||||
|
||||
# 其他模型
|
||||
WEN_XIN = "wenxin"
|
||||
WEN_XIN_4 = "wenxin-4"
|
||||
@@ -135,11 +139,14 @@ MODELSCOPE = "modelscope"
|
||||
|
||||
GITEE_AI_MODEL_LIST = ["Yi-34B-Chat", "InternVL2-8B", "deepseek-coder-33B-instruct", "InternVL2.5-26B", "Qwen2-VL-72B", "Qwen2.5-32B-Instruct", "glm-4-9b-chat", "codegeex4-all-9b", "Qwen2.5-Coder-32B-Instruct", "Qwen2.5-72B-Instruct", "Qwen2.5-7B-Instruct", "Qwen2-72B-Instruct", "Qwen2-7B-Instruct", "code-raccoon-v1", "Qwen2.5-14B-Instruct"]
|
||||
|
||||
MODELSCOPE_MODEL_LIST = ["LLM-Research/c4ai-command-r-plus-08-2024","mistralai/Mistral-Small-Instruct-2409","mistralai/Ministral-8B-Instruct-2410","mistralai/Mistral-Large-Instruct-2407",
|
||||
"Qwen/Qwen2.5-Coder-32B-Instruct","Qwen/Qwen2.5-Coder-14B-Instruct","Qwen/Qwen2.5-Coder-7B-Instruct","Qwen/Qwen2.5-72B-Instruct","Qwen/Qwen2.5-32B-Instruct","Qwen/Qwen2.5-14B-Instruct","Qwen/Qwen2.5-7B-Instruct","Qwen/QwQ-32B-Preview",
|
||||
"LLM-Research/Llama-3.3-70B-Instruct","opencompass/CompassJudger-1-32B-Instruct","Qwen/QVQ-72B-Preview","LLM-Research/Meta-Llama-3.1-405B-Instruct","LLM-Research/Meta-Llama-3.1-8B-Instruct","Qwen/Qwen2-VL-7B-Instruct","LLM-Research/Meta-Llama-3.1-70B-Instruct",
|
||||
"Qwen/Qwen2.5-14B-Instruct-1M","Qwen/Qwen2.5-7B-Instruct-1M","Qwen/Qwen2.5-VL-3B-Instruct","Qwen/Qwen2.5-VL-7B-Instruct","Qwen/Qwen2.5-VL-72B-Instruct","deepseek-ai/DeepSeek-R1-Distill-Llama-70B","deepseek-ai/DeepSeek-R1-Distill-Llama-8B","deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
|
||||
"deepseek-ai/DeepSeek-R1-Distill-Qwen-14B","deepseek-ai/DeepSeek-R1-Distill-Qwen-7B","deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B","deepseek-ai/DeepSeek-R1","deepseek-ai/DeepSeek-V3","Qwen/QwQ-32B"]
|
||||
MODELSCOPE_MODEL_LIST = ["deepseek-ai/DeepSeek-R1-0528", "deepseek-ai/DeepSeek-R1-Distill-Llama-70B", "deepseek-ai/DeepSeek-R1-Distill-Llama-8B", "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B", "deepseek-ai/DeepSeek-R1-Distill-Qwen-14B", "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
|
||||
"deepseek-ai/DeepSeek-R1-Distill-Qwen-7B", "deepseek-ai/DeepSeek-V3.2", "LLM-Research/c4ai-command-r-plus-08-2024", "LLM-Research/Llama-4-Maverick-17B-128E-Instruct", "meituan-longcat/LongCat-Flash-Lite", "MiniMax/MiniMax-M1-80k", "MiniMax/MiniMax-M2.5", "mistralai/Ministral-8B-Instruct-2410",
|
||||
"mistralai/Mistral-Large-Instruct-2407", "mistralai/Mistral-Small-Instruct-2409", "moonshotai/Kimi-K2.5", "MusePublic/Qwen-Image-Edit", "opencompass/CompassJudger-1-32B-Instruct", "OpenGVLab/InternVL3_5-241B-A28B",
|
||||
"Qwen/QVQ-72B-Preview", "Qwen/Qwen-Image-Edit", "Qwen/Qwen3-0.6B", "Qwen/Qwen3-1.7B", "Qwen/Qwen3-14B", "Qwen/Qwen3-235B-A22B", "Qwen/Qwen3-235B-A22B-Instruct-2507", "Qwen/Qwen3-235B-A22B-Thinking-2507", "Qwen/Qwen3-30B-A3B", "Qwen/Qwen3-30B-A3B-Thinking-2507",
|
||||
"Qwen/Qwen3-32B", "Qwen/Qwen3-4B", "Qwen/Qwen3-8B", "Qwen/Qwen3-Coder-30B-A3B-Instruct", "Qwen/Qwen3-Coder-480B-A35B-Instruct", "Qwen/Qwen3-Next-80B-A3B-Instruct", "Qwen/Qwen3-Next-80B-A3B-Thinking", "Qwen/Qwen3-VL-235B-A22B-Instruct", "Qwen/Qwen3-VL-8B-Instruct",
|
||||
"Qwen/Qwen3-VL-8B-Thinking", "Qwen/Qwen3.5-122B-A10B", "Qwen/Qwen3.5-27B", "Qwen/Qwen3.5-35B-A3B", "Qwen/Qwen3.5-397B-A17B", "Qwen/QwQ-32B", "Qwen/QwQ-32B-Preview", "Shanghai_AI_Laboratory/Intern-S1", "Shanghai_AI_Laboratory/Intern-S1-mini",
|
||||
"stepfun-ai/Step-3.5-Flash", "XiaomiMiMo/MiMo-V2-Flash", "ZhipuAI/GLM-4.7-Flash", "ZhipuAI/GLM-5"]
|
||||
|
||||
|
||||
MODEL_LIST = [
|
||||
# Claude
|
||||
|
||||
@@ -1,5 +1,6 @@
|
||||
import logging
|
||||
import sys
|
||||
import io
|
||||
|
||||
|
||||
def _reset_logger(log):
|
||||
@@ -9,7 +10,10 @@ def _reset_logger(log):
|
||||
del handler
|
||||
log.handlers.clear()
|
||||
log.propagate = False
|
||||
console_handle = logging.StreamHandler(sys.stdout)
|
||||
stdout = sys.stdout
|
||||
if hasattr(stdout, "buffer"):
|
||||
stdout = io.TextIOWrapper(stdout.buffer, encoding="utf-8", errors="replace", line_buffering=True)
|
||||
console_handle = logging.StreamHandler(stdout)
|
||||
console_handle.setFormatter(
|
||||
logging.Formatter(
|
||||
"[%(levelname)s][%(asctime)s][%(filename)s:%(lineno)d] - %(message)s",
|
||||
|
||||
@@ -408,7 +408,7 @@ def get_root():
|
||||
|
||||
|
||||
def read_file(path):
|
||||
with open(path, mode="r", encoding="utf-8") as f:
|
||||
with open(path, mode="r", encoding="utf-8-sig") as f:
|
||||
return f.read()
|
||||
|
||||
|
||||
|
||||
@@ -4,29 +4,54 @@ LABEL maintainer="foo@bar.com"
|
||||
ARG TZ='Asia/Shanghai'
|
||||
|
||||
ARG CHATGPT_ON_WECHAT_VER
|
||||
# Set to "false" to skip Playwright/Chromium and produce a smaller image
|
||||
ARG INSTALL_BROWSER=true
|
||||
# Set to "true" to use China mirrors for apt / pip / playwright (faster in CN)
|
||||
ARG USE_CN_MIRROR=false
|
||||
|
||||
RUN echo /etc/apt/sources.list
|
||||
# RUN sed -i 's/deb.debian.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apt/sources.list
|
||||
ENV PLAYWRIGHT_BROWSERS_PATH=/app/ms-playwright
|
||||
ENV BUILD_PREFIX=/app
|
||||
|
||||
# Optionally switch apt and pip to China mirrors
|
||||
RUN if [ "$USE_CN_MIRROR" = "true" ]; then \
|
||||
sed -i 's/deb.debian.org/mirrors.tuna.tsinghua.edu.cn/g' /etc/apt/sources.list; \
|
||||
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/; \
|
||||
fi
|
||||
|
||||
ADD . ${BUILD_PREFIX}
|
||||
|
||||
# All heavy installs + user creation in ONE layer to avoid chown duplication
|
||||
RUN apt-get update \
|
||||
&&apt-get install -y --no-install-recommends bash ffmpeg espeak libavcodec-extra\
|
||||
&& apt-get install -y --no-install-recommends bash ffmpeg espeak libavcodec-extra \
|
||||
&& cd ${BUILD_PREFIX} \
|
||||
&& cp config-template.json config.json \
|
||||
&& /usr/local/bin/python -m pip install --no-cache --upgrade pip \
|
||||
&& pip install --no-cache -r requirements.txt \
|
||||
&& pip install --no-cache -r requirements-optional.txt
|
||||
&& pip install --no-cache -r requirements-optional.txt \
|
||||
&& pip install --no-cache -e . \
|
||||
&& if [ "$INSTALL_BROWSER" = "true" ]; then \
|
||||
apt-get install -y --no-install-recommends fonts-wqy-zenhei \
|
||||
&& pip install --no-cache "playwright==1.52.0" \
|
||||
&& python -m playwright install-deps chromium \
|
||||
&& mkdir -p /app/ms-playwright \
|
||||
&& if [ "$USE_CN_MIRROR" = "true" ]; then \
|
||||
PLAYWRIGHT_DOWNLOAD_HOST=https://registry.npmmirror.com/-/binary/playwright \
|
||||
python -m playwright install chromium; \
|
||||
else \
|
||||
python -m playwright install chromium; \
|
||||
fi; \
|
||||
fi \
|
||||
&& rm -rf /var/lib/apt/lists/* \
|
||||
&& mkdir -p /home/agent/cow \
|
||||
&& groupadd -r agent \
|
||||
&& useradd -r -g agent -s /bin/bash -d /home/agent agent \
|
||||
&& chown -R agent:agent /home/agent ${BUILD_PREFIX} /usr/local/lib
|
||||
|
||||
WORKDIR ${BUILD_PREFIX}
|
||||
|
||||
ADD docker/entrypoint.sh /entrypoint.sh
|
||||
|
||||
RUN chmod +x /entrypoint.sh \
|
||||
&& mkdir -p /home/agent/cow \
|
||||
&& groupadd -r agent \
|
||||
&& useradd -r -g agent -s /bin/bash -d /home/agent agent \
|
||||
&& chown -R agent:agent /home/agent ${BUILD_PREFIX} /usr/local/lib
|
||||
&& chown agent:agent /entrypoint.sh
|
||||
|
||||
ENTRYPOINT ["/entrypoint.sh"]
|
||||
|
||||
185
docs/agent.md
185
docs/agent.md
@@ -1,185 +0,0 @@
|
||||
# CowAgent介绍
|
||||
|
||||
## 概述
|
||||
|
||||
Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent**,能够主动规思考和规划任务、拥有长期记忆、操作计算机和外部资源、创造和执行Skill,真正理解你并和你一起成长。CowAgent能够长期运行在个人电脑或服务器中,通过飞书、钉钉、企业微信、网页等多种方式进行交互。核心能力如下:
|
||||
|
||||
- **复杂任务规划**:能够理解复杂任务并自主规划执行,持续思考和调用工具直到完成目标,支持多轮推理和上下文理解
|
||||
- **工具系统**:内置实现10+种工具,包括文件读写、bash终端、浏览器、定时任务、记忆管理等,通过Agent管理你的计算机或服务器
|
||||
- **长期记忆**:自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索
|
||||
- **Skills系统**:新增Skill运行引擎,内置多种技能,并支持通过自然语言对话完成自定义Skills开发
|
||||
- **多渠道和多模型支持**:支持在Web、飞书、钉钉、企微等多渠道与Agent交互,支持Claude、Gemini、OpenAI、GLM、MiniMax、Qwen、Kimi、Doubao 等多种国内外主流模型
|
||||
- **安全和成本**:通过秘钥管理工具、提示词控制、系统权限等手段控制Agent的访问安全;通过最大记忆轮次、最大上下文token、工具执行步数对token成本进行限制
|
||||
|
||||
|
||||
## 核心功能
|
||||
|
||||
### 1. 长期记忆
|
||||
|
||||
> 记忆系统让 Agent 能够长期记住重要信息。Agent 会在用户分享偏好、决策、事实等重要信息时主动存储,也会在对话达到一定长度时自动提取摘要。记忆分为核心记忆、天级记忆,支持语义搜索和向量检索的混合检索模式。
|
||||
|
||||
|
||||
第一次启动Agent会主动向用户获取询问关键信息,并记录至工作空间 (默认为 ~/cow) 中的智能体设定、用户身份、记忆文件中。
|
||||
|
||||
在后续的长期对话中,Agent会在需要的时候智能记录或检索记忆,并对自身设定、用户偏好、记忆文件等进行不断更新,总结和记录经验和教训,真正实现自主思考和不断成长。
|
||||
|
||||
<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png" />
|
||||
|
||||
|
||||
|
||||
### 2. 任务规划和工具调用
|
||||
|
||||
工具是Agent访问操作系统资源的核心,Agent会根据任务需求智能选择和调用工具,完成文件读写、命令执行、定时任务等各类操作。内置工具的视线在项目的 `tools` 目录下。
|
||||
|
||||
**主要工具:** 文件读写编辑、Bash终端、浏览器、文件发送、定时调度、记忆搜索、环境配置等。
|
||||
|
||||
#### 1.1 终端和文件访问能力
|
||||
|
||||
针对操作系统的终端和文件的访问能力,是最基础和核心的工具,其他很多工具或技能都是基于基础工具进行扩展。用户可通过手机端与Agent交互,操作个人电脑或服务器上的资源:
|
||||
|
||||
<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png" />
|
||||
|
||||
#### 1.2 编程能力
|
||||
|
||||
基于编程能力和系统访问能力,Agent可以实现从信息搜索、图片等素材生成、编码、测试、部署、Nginx配置修改、发布的 Vibecoding 全流程,通过手机端简单的一句命令完成应用的快速demo:
|
||||
|
||||
|
||||
<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png" />
|
||||
|
||||
|
||||
|
||||
#### 1.3 定时任务
|
||||
|
||||
基于 scheduler 工具实现动态定时任务,支持 **一次性任务、固定时间间隔、Cron表达式** 三种形式,任务触发可选择**固定消息发送** 或 **Agent动态任务** 执行两种模式,有很高灵活性:
|
||||
|
||||
|
||||
<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png" />
|
||||
|
||||
同时你也可以通过自然语言快速查看和管理已有的定时任务。
|
||||
|
||||
|
||||
#### 1.4 环境变量管理
|
||||
|
||||
技能所需要的秘钥存储在环境变量文件中,由 `env_config` 工具进行管理,你可以通过对话的方式更新秘钥,工具内置了安全保护和脱敏策略,会严格保护秘钥安全:
|
||||
|
||||
<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png" />
|
||||
|
||||
### 3. 技能系统
|
||||
|
||||
> 技能系统为Agent提供无限的扩展性,每个Skill由说明文件、运行脚本 (可选)、资源 (可选) 组成,描述如何完成特定类型的任务。通过Skill可以让Agent遵循说明完成复杂流程,调用各类工具或对接第三方系统等。
|
||||
|
||||
- **内置技能:** 在项目的`skills`目录下,包含技能创造器、网络搜索、图像识别(openai-image-vision)、LinkAI智能体、网页抓取等。内置Skill根据依赖条件 (API Key、系统命令等) 自动判断是否启用。通过技能创造器可以快速创建自定义技能。
|
||||
|
||||
- **自定义技能:** 由用户通过对话创建,存放在工作空间中 (`~/cow/skills/`),基于自定义技能可以实现任何复杂的业务流程和第三方系统对接。
|
||||
|
||||
|
||||
#### 3.1 创建技能
|
||||
|
||||
通过 `skill-creator` 技能可以通过对话的方式快速创建技能。你可以在与Agent的写作中让他对将某个工作流程固化为技能,或者把任意接口文档和示例发送给Agent,让他直接完成对接:
|
||||
|
||||
<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png" />
|
||||
|
||||
|
||||
#### 3.2 搜索和图像识别
|
||||
|
||||
- **搜索技能:** 系统内置实现了 `bocha-search`(博查搜索)的Skill,依赖环境变量 `BOCHA_SEARCH_API_KEY`,可在[控制台](https://open.bochaai.com/)进行创建,并发送给Agent完成配置
|
||||
- **图像识别技能:** 实现了 `openai-image-vision` 插件,可使用 gpt-4.1-mini、gpt-4.1 等图像识别模型。依赖秘钥 `OPENAI_API_KEY`,可通过config.json或env_config工具进行维护。
|
||||
|
||||
<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png" />
|
||||
|
||||
|
||||
#### 3.3 三方知识库和插件
|
||||
|
||||
`linkai-agent` 技能可以将 [LinkAI](https://link-ai.tech/) 上的所有智能体作为skill交给Agent使用,并实现多智能体决策的效果。
|
||||
|
||||
使用方式:需通过对话的方式配置 `LINKAI_API_KEY`,或在config.json中添加 `linkai_api_key`。 并在 `skills/linkai-agent/config.json`中添加智能体说明,示例如下:
|
||||
|
||||
```json
|
||||
{
|
||||
"apps": [
|
||||
{
|
||||
"app_code": "G7z6vKwp",
|
||||
"app_name": "LinkAI客服助手",
|
||||
"app_description": "当用户需要了解LinkAI平台相关问题时才选择该助手,基于LinkAI知识库进行回答"
|
||||
},
|
||||
{
|
||||
"app_code": "SFY5x7JR",
|
||||
"app_name": "内容创作助手",
|
||||
"app_description": "当用户需要创作图片或视频时才使用该助手,支持Nano Banana、Seedream、即梦、Veo、可灵等多种模型"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
Agent可根据智能体的名称和描述进行决策,并通过 app_code 调用接口访问对应的应用/工作流,通过该技能,可以灵活访问LinkAI平台上的智能体、知识库、插件等能力,实现效果如下:
|
||||
|
||||
<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png" />
|
||||
|
||||
注:需通过 `env_config` 配置 `LINKAI_API_KEY`,或在config.json中添加 `linkai_api_key` 配置。
|
||||
|
||||
|
||||
## 使用方式
|
||||
|
||||
> 详细使用方式参考项目README.md文档进行
|
||||
|
||||
### 1.项目运行
|
||||
|
||||
在命令行中执行:
|
||||
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
|
||||
详细说明及后续程序管理参考:[项目启动脚本](https://github.com/zhayujie/chatgpt-on-wechat/wiki/CowAgentQuickStart)
|
||||
|
||||
|
||||
### 2.模型选择
|
||||
|
||||
Agent模式推荐使用以下模型,可根据效果及成本综合选择:
|
||||
|
||||
- **MiniMax**: `MiniMax-M2.7`
|
||||
- **GLM**: `glm-5-turbo`
|
||||
- **Kimi**: `kimi-k2.5`
|
||||
- **Doubao**: `doubao-seed-2-0-code-preview-260215`
|
||||
- **Qwen**: `qwen3.5-plus`
|
||||
- **Claude**: `claude-sonnet-4-6`
|
||||
- **Gemini**: `gemini-3.1-flash-lite-preview`
|
||||
- **OpenAI**: `gpt-5.4`
|
||||
|
||||
详细模型配置方式参考 [README.md 模型说明](../README.md#模型说明)
|
||||
|
||||
### 3.Agent核心配置
|
||||
|
||||
Agent模式的核心配置项如下,在 `config.json` 中配置:
|
||||
|
||||
```bash
|
||||
{
|
||||
"agent": true, # 是否启用Agent模式
|
||||
"agent_workspace": "~/cow", # Agent工作空间路径
|
||||
"agent_max_context_tokens": 40000, # 最大上下文tokens
|
||||
"agent_max_context_turns": 30, # 最大上下文记忆轮次
|
||||
"agent_max_steps": 15 # 单次任务最大决策步数
|
||||
}
|
||||
```
|
||||
|
||||
**配置说明:**
|
||||
|
||||
- `agent`: 设为 `true` 启用Agent模式,获得多轮工具决策、长期记忆、Skills等能力
|
||||
- `agent_workspace`: 工作空间路径,用于存储 memory、skills、其他系统设定提示词
|
||||
- `agent_max_context_tokens`: 上下文token上限,超出将自动丢弃最早的对话
|
||||
- `agent_max_context_turns`: 上下文记忆轮次,每轮包括一次提问和回复
|
||||
- `agent_max_steps`: 单次任务最大工具调用步数,防止无限循环
|
||||
|
||||
|
||||
### 4.渠道接入
|
||||
|
||||
Agent支持在多种渠道中使用,只需修改 `config.json` 中的 `channel_type` 配置即可切换。
|
||||
|
||||
- **Web网页**:默认使用该渠道,运行后监听本地端口,通过浏览器访问
|
||||
- **飞书接入**:[飞书接入文档](https://docs.link-ai.tech/cow/multi-platform/feishu)
|
||||
- **钉钉接入**:[钉钉接入文档](https://docs.link-ai.tech/cow/multi-platform/dingtalk)
|
||||
- **企业微信应用接入**:[企微应用文档](https://docs.link-ai.tech/cow/multi-platform/wechat-com)
|
||||
- **企微智能机器人**:[企微智能机器人文档](https://docs.link-ai.tech/cow/multi-platform/wecom-bot)
|
||||
- **QQ机器人**:[QQ机器人文档](https://docs.link-ai.tech/cow/multi-platform/qq)
|
||||
|
||||
更多渠道配置参考:[通道说明](../README.md#通道说明)
|
||||
@@ -9,7 +9,23 @@ description: 将 CowAgent 接入企业微信智能机器人(长连接模式)
|
||||
智能机器人与企业微信自建应用是两种不同的接入方式。智能机器人使用 WebSocket 长连接,无需服务器公网 IP 和域名,配置更简单。
|
||||
</Note>
|
||||
|
||||
## 一、创建智能机器人
|
||||
## 一、接入方式
|
||||
|
||||
### 方式一:扫码一键接入(推荐)
|
||||
|
||||
无需提前创建机器人,启动 Cow 项目后打开 Web 控制台(本地链接:http://127.0.0.1:9899/),选择 **通道** 菜单,点击**接入通道**,选择**企微智能机器人**,切换到「扫码接入」模式,使用**企业微信**扫码即可自动完成机器人创建和接入。
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401121213.png" width="800"/>
|
||||
|
||||
<Note>
|
||||
扫码成功后,可在企业微信工作台 - **智能机器人**页面对机器人进行进一步配置,包括修改名称、头像、可见范围等。
|
||||
</Note>
|
||||
|
||||
### 方式二:手动创建接入
|
||||
|
||||
需要先在企业微信中创建智能机器人并获取 Bot ID 和 Secret,再通过 Web 控制台或配置文件接入。
|
||||
|
||||
**步骤一:创建智能机器人**
|
||||
|
||||
1. 打开企业微信客户端,进入工作台,点击**智能机器人**:
|
||||
|
||||
@@ -25,34 +41,35 @@ description: 将 CowAgent 接入企业微信智能机器人(长连接模式)
|
||||
|
||||
4. 设置机器人名称、头像、可见范围,并选择**长连接模式**,记录下 **Bot ID** 和 **Secret** 信息后点击保存。
|
||||
|
||||
## 二、配置和运行
|
||||
**步骤二:接入 CowAgent**
|
||||
|
||||
### 方式一:Web 控制台接入
|
||||
<Tabs>
|
||||
<Tab title="Web 控制台">
|
||||
打开 Web 控制台,选择**通道**菜单,点击**接入通道**,选择**企微智能机器人**,切换到「手动填写」模式,输入 Bot ID 和 Secret,点击接入即可。
|
||||
|
||||
启动Cow项目后打开 Web 控制台 (本地链接为: http://127.0.0.1:9899/ ),选择 **通道** 菜单,点击 **接入通道**,选择 **企微智能机器人**,填写上一步保存的 Bot ID 和 Secret,点击接入即可。
|
||||
<img src="https://cdn.link-ai.tech/doc/20260316181711.png" width="800"/>
|
||||
</Tab>
|
||||
<Tab title="配置文件">
|
||||
在 `config.json` 中添加以下配置后启动程序:
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260316181711.png" width="800"/>
|
||||
```json
|
||||
{
|
||||
"channel_type": "wecom_bot",
|
||||
"wecom_bot_id": "YOUR_BOT_ID",
|
||||
"wecom_bot_secret": "YOUR_SECRET"
|
||||
}
|
||||
```
|
||||
|
||||
### 方式二:配置文件接入
|
||||
| 参数 | 说明 |
|
||||
| --- | --- |
|
||||
| `wecom_bot_id` | 智能机器人的 BotID |
|
||||
| `wecom_bot_secret` | 智能机器人的 Secret |
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
在 `config.json` 中添加以下配置:
|
||||
日志显示 `[WecomBot] Subscribe success` 即表示连接成功。
|
||||
|
||||
```json
|
||||
{
|
||||
"channel_type": "wecom_bot",
|
||||
"wecom_bot_id": "YOUR_BOT_ID",
|
||||
"wecom_bot_secret": "YOUR_SECRET"
|
||||
}
|
||||
```
|
||||
|
||||
| 参数 | 说明 |
|
||||
| --- | --- |
|
||||
| `wecom_bot_id` | 智能机器人的 BotID |
|
||||
| `wecom_bot_secret` | 智能机器人的 Secret |
|
||||
|
||||
配置完成后启动程序,日志显示 `[WecomBot] Subscribe success` 即表示连接成功。
|
||||
|
||||
## 三、功能说明
|
||||
## 二、功能说明
|
||||
|
||||
| 功能 | 支持情况 |
|
||||
| --- | --- |
|
||||
@@ -64,7 +81,7 @@ description: 将 CowAgent 接入企业微信智能机器人(长连接模式)
|
||||
| 流式回复 | ✅ |
|
||||
| 定时任务主动推送 | ✅ |
|
||||
|
||||
## 四、使用
|
||||
## 三、使用
|
||||
|
||||
在企业微信中搜索创建的机器人名称,即可开始单聊对话。
|
||||
|
||||
|
||||
115
docs/commands/general.mdx
Normal file
115
docs/commands/general.mdx
Normal file
@@ -0,0 +1,115 @@
|
||||
---
|
||||
title: 常用命令
|
||||
description: 查看状态、管理配置和上下文等常用命令
|
||||
---
|
||||
|
||||
以下命令支持在对话中使用 `/` 前缀,也支持在终端中使用 `cow` 前缀(部分命令仅对话可用)。
|
||||
|
||||
<Tip>
|
||||
在 Web 控制台中输入 `/` 会自动弹出命令提示,支持键盘上下选择和 Tab 补全。
|
||||
</Tip>
|
||||
|
||||
## help
|
||||
|
||||
显示所有可用命令的帮助信息。
|
||||
|
||||
```text
|
||||
/help
|
||||
```
|
||||
|
||||
## status
|
||||
|
||||
查看当前会话和服务的运行状态,包括进程信息、模型配置、会话消息数量和已加载技能数量。
|
||||
|
||||
```text
|
||||
/status
|
||||
```
|
||||
|
||||
输出示例:
|
||||
|
||||
```
|
||||
🐮 CowAgent Status
|
||||
|
||||
Process: PID 12345 | Running 2h 15m
|
||||
Version: 2.0.4
|
||||
Channel: web
|
||||
Model: MiniMax-M2.5
|
||||
Mode: agent
|
||||
|
||||
Session: 12 messages | 8 skills loaded
|
||||
```
|
||||
|
||||
## config
|
||||
|
||||
查看或修改运行时配置。修改后立即生效,无需重启服务。
|
||||
|
||||
**查看所有可配置项:**
|
||||
|
||||
```text
|
||||
/config
|
||||
```
|
||||
|
||||
**查看单个配置项:**
|
||||
|
||||
```text
|
||||
/config model
|
||||
```
|
||||
|
||||
**修改配置项:**
|
||||
|
||||
```text
|
||||
/config model deepseek-chat
|
||||
```
|
||||
|
||||
**支持修改的配置项:**
|
||||
|
||||
| 配置项 | 说明 | 示例值 |
|
||||
| --- | --- | --- |
|
||||
| `model` | AI 模型名称 | `deepseek-chat` |
|
||||
| `agent_max_context_tokens` | 最大上下文 tokens | `40000` |
|
||||
| `agent_max_context_turns` | 最大上下文记忆轮次 | `30` |
|
||||
| `agent_max_steps` | 单次任务最大决策步数 | `15` |
|
||||
|
||||
<Note>
|
||||
修改 `model` 时,系统会自动匹配对应的模型调用方式。配置会写入 `config.json` 并持久保存。
|
||||
</Note>
|
||||
|
||||
## context
|
||||
|
||||
查看当前会话的上下文信息,包括消息数量、内容长度等统计。
|
||||
|
||||
```text
|
||||
/context
|
||||
```
|
||||
|
||||
**清空当前会话上下文:**
|
||||
|
||||
```text
|
||||
/context clear
|
||||
```
|
||||
|
||||
<Tip>
|
||||
清空上下文后,Agent 会"忘记"之前的对话内容,适用于切换话题或释放上下文空间。
|
||||
</Tip>
|
||||
|
||||
## logs
|
||||
|
||||
查看最近的服务日志,默认显示最近 20 行,最多 50 行。
|
||||
|
||||
```text
|
||||
/logs
|
||||
```
|
||||
|
||||
**指定行数:**
|
||||
|
||||
```text
|
||||
/logs 50
|
||||
```
|
||||
|
||||
## version
|
||||
|
||||
显示当前 CowAgent 版本号。
|
||||
|
||||
```text
|
||||
/version
|
||||
```
|
||||
86
docs/commands/index.mdx
Normal file
86
docs/commands/index.mdx
Normal file
@@ -0,0 +1,86 @@
|
||||
---
|
||||
title: 命令总览
|
||||
description: CowAgent 命令系统 — 终端 CLI 和对话命令
|
||||
---
|
||||
|
||||
CowAgent 提供两种命令交互方式:
|
||||
|
||||
- **终端CLI** — 在系统终端中执行 `cow <命令>`,用于服务管理、技能管理等运维操作
|
||||
- **对话命令** — 在对话中输入 `/<命令>` 或 `cow <命令>`,用于查看状态、管理技能、调整配置等
|
||||
|
||||
## 终端命令
|
||||
|
||||
通过一键安装脚本部署后,`cow` 命令会自动可用。手动安装的用户需要在项目根目录下额外执行:
|
||||
|
||||
```bash
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
安装后即可在任意位置使用 `cow` 命令:
|
||||
|
||||
```bash
|
||||
cow help
|
||||
```
|
||||
|
||||
输出示例:
|
||||
|
||||
```
|
||||
CowAgent CLI
|
||||
|
||||
Usage: cow <command>
|
||||
|
||||
Service:
|
||||
start Start the CowAgent service
|
||||
stop Stop the CowAgent service
|
||||
restart Restart the CowAgent service
|
||||
update Update code and restart service
|
||||
status Show service status
|
||||
logs View service logs
|
||||
|
||||
Skills:
|
||||
skill Manage skills (list / search / install / uninstall ...)
|
||||
|
||||
Others:
|
||||
help Show this help message
|
||||
version Show version
|
||||
```
|
||||
|
||||
## 对话命令
|
||||
|
||||
在 Web 控制台或任意接入渠道的对话中,支持输入以 `/` 开头的命令:
|
||||
|
||||
| 命令 | 说明 |
|
||||
| --- | --- |
|
||||
| `/help` | 显示命令帮助 |
|
||||
| `/status` | 查看服务状态和配置 |
|
||||
| `/config` | 查看或修改运行时配置 |
|
||||
| `/skill` | 管理技能(安装、卸载、启用、禁用等) |
|
||||
| `/context` | 查看当前会话上下文信息 |
|
||||
| `/context clear` | 清空当前会话上下文 |
|
||||
| `/logs` | 查看最近日志 |
|
||||
| `/version` | 显示版本号 |
|
||||
|
||||
<Tip>
|
||||
对话命令中 `/start`、`/stop`、`/restart` 等服务管理命令会提示到终端中执行,因为它们涉及进程操作。
|
||||
</Tip>
|
||||
|
||||
## 命令对照表
|
||||
|
||||
以下是各命令在终端和对话中的可用性:
|
||||
|
||||
| 命令 | 终端 (`cow`) | 对话 (`/`) |
|
||||
| --- | :---: | :---: |
|
||||
| help | ✓ | ✓ |
|
||||
| version | ✓ | ✓ |
|
||||
| status | ✓ | ✓ |
|
||||
| logs | ✓ | ✓ |
|
||||
| config | ✗ | ✓ |
|
||||
| context | — | ✓ |
|
||||
| skill (子命令) | ✓ | ✓ |
|
||||
| start / stop / restart | ✓ | ✗ |
|
||||
| update | ✓ | ✗ |
|
||||
| install-browser | ✓ | ✗ |
|
||||
|
||||
<Note>
|
||||
`context` 在终端中仅提示到对话中使用。`config` 仅支持在对话中修改。
|
||||
</Note>
|
||||
134
docs/commands/process.mdx
Normal file
134
docs/commands/process.mdx
Normal file
@@ -0,0 +1,134 @@
|
||||
---
|
||||
title: 进程管理
|
||||
description: 使用 cow 命令管理 CowAgent 进程的启动、停止、重启、更新等操作
|
||||
---
|
||||
|
||||
进程管理命令用于控制 CowAgent 后台进程的生命周期。这些命令仅在终端中可用。
|
||||
|
||||
## start
|
||||
|
||||
启动 CowAgent 服务。默认以后台进程方式运行,并自动跟踪日志输出。
|
||||
|
||||
```bash
|
||||
cow start
|
||||
```
|
||||
|
||||
**选项:**
|
||||
|
||||
| 选项 | 说明 |
|
||||
| --- | --- |
|
||||
| `-f`, `--foreground` | 前台运行,不以后台守护进程方式启动 |
|
||||
| `--no-logs` | 启动后不自动跟踪日志 |
|
||||
|
||||
## stop
|
||||
|
||||
停止正在运行的 CowAgent 服务。
|
||||
|
||||
```bash
|
||||
cow stop
|
||||
```
|
||||
|
||||
## restart
|
||||
|
||||
重启 CowAgent 服务(先停止再启动)。
|
||||
|
||||
```bash
|
||||
cow restart
|
||||
```
|
||||
|
||||
**选项:**
|
||||
|
||||
| 选项 | 说明 |
|
||||
| --- | --- |
|
||||
| `--no-logs` | 重启后不自动跟踪日志 |
|
||||
|
||||
## update
|
||||
|
||||
更新代码并重启服务。自动执行以下流程:
|
||||
|
||||
1. 拉取最新代码(`git pull`)
|
||||
2. 停止当前服务
|
||||
3. 更新 Python 依赖
|
||||
4. 重新安装 CLI
|
||||
5. 启动服务
|
||||
|
||||
```bash
|
||||
cow update
|
||||
```
|
||||
|
||||
<Warning>
|
||||
如果 `git pull` 失败(如存在本地未提交的修改),更新会中止,服务不受影响。
|
||||
</Warning>
|
||||
|
||||
## status
|
||||
|
||||
查看 CowAgent 服务运行状态,包括进程信息、版本号、当前配置的模型和通道。
|
||||
|
||||
```bash
|
||||
cow status
|
||||
```
|
||||
|
||||
输出示例:
|
||||
|
||||
```
|
||||
🐮 CowAgent Status
|
||||
Status: ● Running (PID: 12345)
|
||||
Version: 2.0.4
|
||||
Channel: web
|
||||
Model: MiniMax-M2.5
|
||||
Mode: agent
|
||||
```
|
||||
|
||||
## logs
|
||||
|
||||
查看服务日志。
|
||||
|
||||
```bash
|
||||
cow logs
|
||||
```
|
||||
|
||||
**选项:**
|
||||
|
||||
| 选项 | 说明 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `-f`, `--follow` | 持续跟踪日志输出 | 否 |
|
||||
| `-n`, `--lines` | 显示最近 N 行 | 50 |
|
||||
|
||||
示例:
|
||||
|
||||
```bash
|
||||
# 查看最近 100 行日志
|
||||
cow logs -n 100
|
||||
|
||||
# 持续跟踪日志
|
||||
cow logs -f
|
||||
```
|
||||
|
||||
## install-browser
|
||||
|
||||
安装 Playwright 和 Chromium 浏览器,用于启用 [浏览器工具](/tools/browser)。
|
||||
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
<Tip>
|
||||
仅在需要使用浏览器工具(如网页浏览、截图等)时才需要安装。
|
||||
</Tip>
|
||||
|
||||
## run.sh 兼容
|
||||
|
||||
如果未安装 Cow CLI,也可以使用 `run.sh` 脚本管理服务:
|
||||
|
||||
| cow 命令 | run.sh 等效命令 |
|
||||
| --- | --- |
|
||||
| `cow start` | `./run.sh start` |
|
||||
| `cow stop` | `./run.sh stop` |
|
||||
| `cow restart` | `./run.sh restart` |
|
||||
| `cow update` | `./run.sh update` |
|
||||
| `cow status` | `./run.sh status` |
|
||||
| `cow logs` | `./run.sh logs` |
|
||||
|
||||
<Note>
|
||||
推荐使用 `cow` 命令,它提供更简洁的语法和更丰富的功能。通过一键安装脚本部署时 `cow` 命令会自动安装。
|
||||
</Note>
|
||||
218
docs/commands/skill.mdx
Normal file
218
docs/commands/skill.mdx
Normal file
@@ -0,0 +1,218 @@
|
||||
---
|
||||
title: 技能管理
|
||||
description: 通过命令安装、卸载、启用、禁用和管理技能
|
||||
---
|
||||
|
||||
技能管理命令用于安装、查询和管理 CowAgent 的技能。在对话中使用 `/skill <子命令>`,在终端中使用 `cow skill <子命令>`。
|
||||
|
||||
## list
|
||||
|
||||
列出已安装的技能及其状态。
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill list
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill list
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
输出示例:
|
||||
|
||||
```
|
||||
📦 已安装的技能 (3/4)
|
||||
|
||||
✅ pptx
|
||||
Use this skill any time a .pptx file is involved…
|
||||
来源: cowhub
|
||||
|
||||
✅ skill-creator
|
||||
Create, install, or update skills…
|
||||
来源: builtin
|
||||
|
||||
⏸️ image-vision (已禁用)
|
||||
图片理解和视觉分析
|
||||
来源: builtin
|
||||
```
|
||||
|
||||
**浏览技能广场**(查看 Hub 上所有可安装的技能):
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill list --remote
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill list --remote
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**选项:**
|
||||
|
||||
| 选项 | 说明 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `--remote`, `-r` | 浏览 Skill Hub 远程技能列表 | 否 |
|
||||
| `--page` | 远程列表分页页码 | 1 |
|
||||
|
||||
## search
|
||||
|
||||
在技能广场中搜索技能。
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill search pptx
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill search pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## install
|
||||
|
||||
安装技能。通过统一的 `install` 命令,可一键安装来自 **Cow 技能广场、GitHub、ClawHub** 以及任意 URL(zip 压缩包、SKILL.md 链接)上的技能,无需手动下载和配置。
|
||||
|
||||
**从 Cow 技能广场安装(推荐):**
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill install pptx
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill install pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**从 GitHub 安装:**
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
# 安装仓库中的所有技能(自动扫描包含 SKILL.md 的子目录)
|
||||
/skill install larksuite/cli
|
||||
|
||||
# 指定子目录,只安装单个技能
|
||||
/skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
|
||||
# 使用 # 指定子目录
|
||||
/skill install larksuite/cli#skills/lark-minutes
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
# 安装仓库中的所有技能(自动扫描包含 SKILL.md 的子目录)
|
||||
cow skill install larksuite/cli
|
||||
|
||||
# 指定子目录,只安装单个技能
|
||||
cow skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
|
||||
# 使用 # 指定子目录
|
||||
cow skill install larksuite/cli#skills/lark-minutes
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
支持完整的 GitHub URL 和 `owner/repo` 简写。对于 mono-repo(一个仓库中包含多个技能),不指定子目录时会自动发现并批量安装所有技能;指定子目录时只安装该目录下的技能。
|
||||
|
||||
**从 ClawHub 安装:**
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill install clawhub:baidu-search
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill install clawhub:baidu-search
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**从 URL 安装:**
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
# 从 zip 压缩包安装(支持单个或批量)
|
||||
/skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
|
||||
# 从 SKILL.md 链接安装
|
||||
/skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
# 从 zip 压缩包安装(支持单个或批量)
|
||||
cow skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
|
||||
# 从 SKILL.md 链接安装
|
||||
cow skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
支持从 zip / tar.gz 压缩包 URL 安装,解压后自动扫描包含 `SKILL.md` 的目录,支持单个或批量安装。也支持直接从 `SKILL.md` 文件链接安装,会自动解析技能名称和描述。
|
||||
|
||||
安装成功后会显示技能名称、描述和来源,例如:
|
||||
|
||||
```
|
||||
✅ baidu-search
|
||||
百度搜索:使用百度搜索引擎检索信息…
|
||||
来源: clawhub
|
||||
```
|
||||
|
||||
## uninstall
|
||||
|
||||
卸载已安装的技能。
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill uninstall pptx
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill uninstall pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
<Warning>
|
||||
卸载操作会删除技能目录下的所有文件,此操作不可恢复。
|
||||
</Warning>
|
||||
|
||||
## enable / disable
|
||||
|
||||
启用或禁用技能,禁用后技能不会被 Agent 调用。
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill enable pptx
|
||||
/skill disable pptx
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill enable pptx
|
||||
cow skill disable pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## info
|
||||
|
||||
查看已安装技能的详细信息,包括 `SKILL.md` 内容预览。
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill info pptx
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill info pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## 技能来源
|
||||
|
||||
安装的技能会记录来源信息,可通过 `/skill list` 查看:
|
||||
|
||||
| 来源标识 | 说明 |
|
||||
| --- | --- |
|
||||
| `builtin` | 项目内置技能 |
|
||||
| `cowhub` | 从 CowAgent Skill Hub 安装 |
|
||||
| `github` | 从 GitHub URL 直接安装 |
|
||||
| `clawhub` | 从 ClawHub 安装 |
|
||||
| `url` | 从 SKILL.md URL 安装 |
|
||||
| `local` | 本地创建的技能 |
|
||||
110
docs/docs.json
110
docs/docs.json
@@ -106,14 +106,17 @@
|
||||
"tools/bash",
|
||||
"tools/send",
|
||||
"tools/memory",
|
||||
"tools/env-config"
|
||||
"tools/env-config",
|
||||
"tools/web-fetch",
|
||||
"tools/scheduler"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "可选工具",
|
||||
"pages": [
|
||||
"tools/web-search",
|
||||
"tools/scheduler"
|
||||
"tools/vision",
|
||||
"tools/browser"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -125,15 +128,9 @@
|
||||
"group": "技能系统",
|
||||
"pages": [
|
||||
"skills/index",
|
||||
"skills/skill-creator"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "内置技能",
|
||||
"pages": [
|
||||
"skills/image-vision",
|
||||
"skills/linkai-agent",
|
||||
"skills/web-fetch"
|
||||
"skills/install",
|
||||
"skills/create",
|
||||
"skills/hub"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -144,7 +141,8 @@
|
||||
{
|
||||
"group": "记忆系统",
|
||||
"pages": [
|
||||
"memory"
|
||||
"memory/index",
|
||||
"memory/context"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -167,6 +165,20 @@
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"tab": "命令",
|
||||
"groups": [
|
||||
{
|
||||
"group": "命令系统",
|
||||
"pages": [
|
||||
"commands/index",
|
||||
"commands/process",
|
||||
"commands/skill",
|
||||
"commands/general"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"tab": "版本",
|
||||
"groups": [
|
||||
@@ -174,6 +186,7 @@
|
||||
"group": "发布记录",
|
||||
"pages": [
|
||||
"releases/overview",
|
||||
"releases/v2.0.5",
|
||||
"releases/v2.0.4",
|
||||
"releases/v2.0.3",
|
||||
"releases/v2.0.2",
|
||||
@@ -254,14 +267,17 @@
|
||||
"en/tools/bash",
|
||||
"en/tools/send",
|
||||
"en/tools/memory",
|
||||
"en/tools/env-config"
|
||||
"en/tools/env-config",
|
||||
"en/tools/web-fetch",
|
||||
"en/tools/scheduler"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "Optional Tools",
|
||||
"pages": [
|
||||
"en/tools/web-search",
|
||||
"en/tools/scheduler"
|
||||
"en/tools/vision",
|
||||
"en/tools/browser"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -273,15 +289,9 @@
|
||||
"group": "Skills System",
|
||||
"pages": [
|
||||
"en/skills/index",
|
||||
"en/skills/skill-creator"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "Built-in Skills",
|
||||
"pages": [
|
||||
"en/skills/image-vision",
|
||||
"en/skills/linkai-agent",
|
||||
"en/skills/web-fetch"
|
||||
"en/skills/install",
|
||||
"en/skills/skill-creator",
|
||||
"en/skills/hub"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -292,7 +302,8 @@
|
||||
{
|
||||
"group": "Memory System",
|
||||
"pages": [
|
||||
"en/memory"
|
||||
"en/memory/index",
|
||||
"en/memory/context"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -315,6 +326,20 @@
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"tab": "Commands",
|
||||
"groups": [
|
||||
{
|
||||
"group": "Command System",
|
||||
"pages": [
|
||||
"en/commands/index",
|
||||
"en/commands/process",
|
||||
"en/commands/skill",
|
||||
"en/commands/chat"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"tab": "Releases",
|
||||
"groups": [
|
||||
@@ -322,6 +347,7 @@
|
||||
"group": "Release Notes",
|
||||
"pages": [
|
||||
"en/releases/overview",
|
||||
"en/releases/v2.0.5",
|
||||
"en/releases/v2.0.4",
|
||||
"en/releases/v2.0.2",
|
||||
"en/releases/v2.0.1",
|
||||
@@ -403,14 +429,16 @@
|
||||
"ja/tools/send",
|
||||
"ja/tools/memory",
|
||||
"ja/tools/env-config",
|
||||
"ja/tools/browser"
|
||||
"ja/tools/web-fetch",
|
||||
"ja/tools/scheduler"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "オプションツール",
|
||||
"pages": [
|
||||
"ja/tools/web-search",
|
||||
"ja/tools/scheduler"
|
||||
"ja/tools/vision",
|
||||
"ja/tools/browser"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -422,15 +450,9 @@
|
||||
"group": "スキルシステム",
|
||||
"pages": [
|
||||
"ja/skills/index",
|
||||
"ja/skills/skill-creator"
|
||||
]
|
||||
},
|
||||
{
|
||||
"group": "内蔵スキル",
|
||||
"pages": [
|
||||
"ja/skills/image-vision",
|
||||
"ja/skills/linkai-agent",
|
||||
"ja/skills/web-fetch"
|
||||
"ja/skills/install",
|
||||
"ja/skills/create",
|
||||
"ja/skills/hub"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -441,7 +463,8 @@
|
||||
{
|
||||
"group": "メモリシステム",
|
||||
"pages": [
|
||||
"ja/memory"
|
||||
"ja/memory/index",
|
||||
"ja/memory/context"
|
||||
]
|
||||
}
|
||||
]
|
||||
@@ -464,6 +487,20 @@
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"tab": "コマンド",
|
||||
"groups": [
|
||||
{
|
||||
"group": "コマンドシステム",
|
||||
"pages": [
|
||||
"ja/commands/index",
|
||||
"ja/commands/process",
|
||||
"ja/commands/skill",
|
||||
"ja/commands/general"
|
||||
]
|
||||
}
|
||||
]
|
||||
},
|
||||
{
|
||||
"tab": "リリース",
|
||||
"groups": [
|
||||
@@ -471,6 +508,7 @@
|
||||
"group": "リリースノート",
|
||||
"pages": [
|
||||
"ja/releases/overview",
|
||||
"ja/releases/v2.0.5",
|
||||
"ja/releases/v2.0.4",
|
||||
"ja/releases/v2.0.3",
|
||||
"ja/releases/v2.0.2",
|
||||
|
||||
@@ -13,6 +13,7 @@
|
||||
<a href="https://cowagent.ai/">🌐 Website</a> ·
|
||||
<a href="https://docs.cowagent.ai/en/intro/index">📖 Docs</a> ·
|
||||
<a href="https://docs.cowagent.ai/en/guide/quick-start">🚀 Quick Start</a> ·
|
||||
<a href="https://skills.cowagent.ai/">🧩 Skill Hub</a> ·
|
||||
<a href="https://link-ai.tech/cowagent/create">☁️ Try Online</a>
|
||||
</p>
|
||||
|
||||
@@ -20,13 +21,14 @@
|
||||
|
||||
> CowAgent is both an out-of-the-box AI super assistant and a highly extensible Agent framework. You can extend it with new model interfaces, channels, built-in tools, and the Skills system to flexibly implement various customization needs.
|
||||
|
||||
- ✅ **Autonomous Task Planning**: Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved. Supports accessing files, terminal, browser, schedulers, and other system resources via tools.
|
||||
- ✅ **Autonomous Task Planning**: Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved.
|
||||
- ✅ **Long-term Memory**: Automatically persists conversation memory to local files and databases, including core memory and daily memory, with keyword and vector retrieval support.
|
||||
- ✅ **Skills System**: Implements a Skills creation and execution engine with multiple built-in skills, and supports custom Skills development through natural language conversation.
|
||||
- ✅ **Skills System**: Implements a Skills creation and execution engine, supports installing skills from [Skill Hub](https://skills.cowagent.ai), GitHub, etc., or creating custom Skills through conversation.
|
||||
- ✅ **Tool System**: Built-in tools for file I/O, terminal execution, browser automation, scheduled tasks, messaging, and more — autonomously invoked by the Agent.
|
||||
- ✅ **CLI System**: Provides terminal commands and in-chat commands for process management, skill installation, configuration, and more.
|
||||
- ✅ **Multimodal Messages**: Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
|
||||
- ✅ **Multiple Model Support**: Supports OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and other mainstream model providers.
|
||||
- ✅ **Multi-platform Deployment**: Runs on local computers or servers, integrable into WeChat, Web, Feishu, DingTalk, WeChat Official Account, and WeCom applications.
|
||||
- ✅ **Knowledge Base**: Integrates enterprise knowledge base capabilities via the [LinkAI](https://link-ai.tech) platform.
|
||||
|
||||
## Disclaimer
|
||||
|
||||
@@ -40,6 +42,8 @@ Try online (no deployment needed): [CowAgent](https://link-ai.tech/cowagent/crea
|
||||
|
||||
## Changelog
|
||||
|
||||
> **2026.04.01:** [v2.0.5](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.5) — Cow CLI, Skill Hub open source, Browser tool, WeCom Bot QR scan, and more.
|
||||
|
||||
> **2026.02.27:** [v2.0.2](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.2) — Web console overhaul (streaming chat, model/skill/memory/channel/scheduler/log management), multi-channel concurrent running, session persistence, new models including Gemini 3.1 Pro / Claude 4.6 Sonnet / Qwen3.5 Plus.
|
||||
|
||||
> **2026.02.13:** [v2.0.1](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.1) — Built-in Web Search tool, smart context trimming, runtime info dynamic update, Windows compatibility, fixes for scheduler memory loss, Feishu connection issues, and more.
|
||||
@@ -60,13 +64,19 @@ Full changelog: [Release Notes](https://docs.cowagent.ai/en/releases/overview)
|
||||
|
||||
The project provides a one-click script for installation, configuration, startup, and management:
|
||||
|
||||
**Linux / macOS:**
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
|
||||
After running, the Web service starts by default. Access `http://localhost:9899/chat` to chat.
|
||||
|
||||
Script usage: [One-click Install](https://docs.cowagent.ai/en/guide/quick-start)
|
||||
Script usage: [One-click Install](https://docs.cowagent.ai/en/guide/quick-start). After installation, you can also use `cow start`, `cow stop`, and other [CLI commands](https://docs.cowagent.ai/en/commands/index) to manage the service.
|
||||
|
||||
### Manual Installation
|
||||
|
||||
@@ -84,7 +94,25 @@ pip3 install -r requirements.txt
|
||||
pip3 install -r requirements-optional.txt # optional but recommended
|
||||
```
|
||||
|
||||
**3. Configure**
|
||||
**3. Install Cow CLI (recommended)**
|
||||
|
||||
```bash
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
After installation, use `cow` commands to manage the service (start, stop, update, etc.) and skills. See [Command Docs](https://docs.cowagent.ai/en/commands/index).
|
||||
|
||||
**4. Install browser (optional)**
|
||||
|
||||
If you need the Agent to operate a browser (visit web pages, fill forms, etc.):
|
||||
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
This auto-installs `playwright` and Chromium. See [Browser Tool Docs](https://docs.cowagent.ai/en/tools/browser).
|
||||
|
||||
**5. Configure**
|
||||
|
||||
```bash
|
||||
cp config-template.json config.json
|
||||
@@ -92,13 +120,25 @@ cp config-template.json config.json
|
||||
|
||||
Fill in your model API key and channel type in `config.json`. See the [configuration docs](https://docs.cowagent.ai/en/guide/manual-install) for details.
|
||||
|
||||
**4. Run**
|
||||
**6. Run**
|
||||
|
||||
```bash
|
||||
python3 app.py
|
||||
cow start # recommended, requires Cow CLI
|
||||
python3 app.py # or run directly
|
||||
```
|
||||
|
||||
For server background run:
|
||||
For server deployment, use `cow` commands to manage the service:
|
||||
|
||||
```bash
|
||||
cow start # start in background
|
||||
cow stop # stop service
|
||||
cow restart # restart service
|
||||
cow status # check running status
|
||||
cow logs # view logs
|
||||
cow update # pull latest code and restart
|
||||
```
|
||||
|
||||
Or use the traditional way:
|
||||
|
||||
```bash
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
@@ -186,6 +226,7 @@ Multiple channels can be enabled simultaneously, separated by commas: `"channel_
|
||||
|
||||
## 🔗 Related Projects
|
||||
|
||||
- [Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub): Open skill marketplace for AI Agents — browse, search, install, and publish skills for CowAgent, OpenClaw, Claude Code, and more.
|
||||
- [bot-on-anything](https://github.com/zhayujie/bot-on-anything): Lightweight and highly extensible LLM application framework supporting Slack, Telegram, Discord, Gmail, and more.
|
||||
- [AgentMesh](https://github.com/MinimalFuture/AgentMesh): Open-source Multi-Agent framework for complex problem solving through agent team collaboration.
|
||||
|
||||
@@ -195,7 +236,7 @@ FAQs: <https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs>
|
||||
|
||||
## 🛠️ Contributing
|
||||
|
||||
Welcome to add new channels, referring to the [Feishu channel](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py) as an example. Also welcome to contribute new Skills, referring to the [Skill Creator docs](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md).
|
||||
Welcome to add new channels, referring to the [Feishu channel](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py) as an example. Also welcome to contribute new Skills, see the [Skill Creation docs](https://docs.cowagent.ai/en/skills/create), or submit to [Skill Hub](https://skills.cowagent.ai/submit).
|
||||
|
||||
## ✉ Contact
|
||||
|
||||
|
||||
101
docs/en/commands/general.mdx
Normal file
101
docs/en/commands/general.mdx
Normal file
@@ -0,0 +1,101 @@
|
||||
---
|
||||
title: General Commands
|
||||
description: View status, manage config, and control context with commonly used commands
|
||||
---
|
||||
|
||||
The following commands can be used in chat with the `/` prefix or in the terminal with the `cow` prefix (some are chat-only).
|
||||
|
||||
<Tip>
|
||||
In the Web console, typing `/` brings up an autocomplete menu with keyboard navigation and Tab completion.
|
||||
</Tip>
|
||||
|
||||
## help
|
||||
|
||||
Show help information for all available commands.
|
||||
|
||||
```text
|
||||
/help
|
||||
```
|
||||
|
||||
## status
|
||||
|
||||
View current session and service status, including process info, model configuration, message count, and loaded skills.
|
||||
|
||||
```text
|
||||
/status
|
||||
```
|
||||
|
||||
## config
|
||||
|
||||
View or modify runtime configuration. Changes take effect immediately without restarting.
|
||||
|
||||
**View all configurable items:**
|
||||
|
||||
```text
|
||||
/config
|
||||
```
|
||||
|
||||
**View a single item:**
|
||||
|
||||
```text
|
||||
/config model
|
||||
```
|
||||
|
||||
**Modify a config item:**
|
||||
|
||||
```text
|
||||
/config model deepseek-chat
|
||||
```
|
||||
|
||||
**Configurable items:**
|
||||
|
||||
| Item | Description | Example |
|
||||
| --- | --- | --- |
|
||||
| `model` | AI model name | `deepseek-chat` |
|
||||
| `agent_max_context_tokens` | Max context tokens | `40000` |
|
||||
| `agent_max_context_turns` | Max context memory turns | `30` |
|
||||
| `agent_max_steps` | Max decision steps per task | `15` |
|
||||
|
||||
<Note>
|
||||
When changing `model`, the system automatically matches the corresponding model API. Configuration is persisted to `config.json`.
|
||||
</Note>
|
||||
|
||||
## context
|
||||
|
||||
View current session context statistics, including message count and content length.
|
||||
|
||||
```text
|
||||
/context
|
||||
```
|
||||
|
||||
**Clear current session context:**
|
||||
|
||||
```text
|
||||
/context clear
|
||||
```
|
||||
|
||||
<Tip>
|
||||
Clearing context makes the Agent "forget" previous conversation, useful for switching topics or freeing context space.
|
||||
</Tip>
|
||||
|
||||
## logs
|
||||
|
||||
View recent service logs. Shows the last 20 lines by default, up to 50.
|
||||
|
||||
```text
|
||||
/logs
|
||||
```
|
||||
|
||||
**Specify line count:**
|
||||
|
||||
```text
|
||||
/logs 50
|
||||
```
|
||||
|
||||
## version
|
||||
|
||||
Show the current CowAgent version.
|
||||
|
||||
```text
|
||||
/version
|
||||
```
|
||||
84
docs/en/commands/index.mdx
Normal file
84
docs/en/commands/index.mdx
Normal file
@@ -0,0 +1,84 @@
|
||||
---
|
||||
title: Commands Overview
|
||||
description: CowAgent command system — Terminal CLI and chat commands
|
||||
---
|
||||
|
||||
CowAgent provides two ways to interact via commands:
|
||||
|
||||
- **Terminal CLI** — Run `cow <command>` in your system terminal for service management, skill management, and other operations
|
||||
- **Chat Commands** — Type `/<command>` or `cow <command>` in any conversation to check status, manage skills, adjust configuration, etc.
|
||||
|
||||
## Cow CLI
|
||||
|
||||
After deploying with the one-click install script, the `cow` command is automatically available. For manual installations, run:
|
||||
|
||||
```bash
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
Then use the `cow` command from anywhere:
|
||||
|
||||
```bash
|
||||
cow help
|
||||
```
|
||||
|
||||
Example output:
|
||||
|
||||
```
|
||||
🐮 CowAgent CLI
|
||||
|
||||
Usage: cow <command>
|
||||
|
||||
Service:
|
||||
start Start the CowAgent service
|
||||
stop Stop the CowAgent service
|
||||
restart Restart the CowAgent service
|
||||
update Update code and restart service
|
||||
status Show service status
|
||||
logs View service logs
|
||||
|
||||
Skills:
|
||||
skill Manage skills (list / search / install / uninstall ...)
|
||||
|
||||
Others:
|
||||
help Show this help message
|
||||
version Show version
|
||||
```
|
||||
|
||||
## Chat Commands
|
||||
|
||||
In the Web console or any connected channel, type `/` to see command suggestions. Supported commands:
|
||||
|
||||
| Command | Description |
|
||||
| --- | --- |
|
||||
| `/help` | Show command help |
|
||||
| `/status` | View service status and configuration |
|
||||
| `/config` | View or modify runtime configuration |
|
||||
| `/skill` | Manage skills (install, uninstall, enable, disable, etc.) |
|
||||
| `/context` | View current session context info |
|
||||
| `/context clear` | Clear current session context |
|
||||
| `/logs` | View recent logs |
|
||||
| `/version` | Show version number |
|
||||
|
||||
<Tip>
|
||||
Service management commands like `/start`, `/stop`, `/restart` will prompt you to use them in the terminal instead, as they involve process operations.
|
||||
</Tip>
|
||||
|
||||
## Command Availability
|
||||
|
||||
| Command | Terminal (`cow`) | Chat (`/`) |
|
||||
| --- | :---: | :---: |
|
||||
| help | ✓ | ✓ |
|
||||
| version | ✓ | ✓ |
|
||||
| status | ✓ | ✓ |
|
||||
| logs | ✓ | ✓ |
|
||||
| config | ✗ | ✓ |
|
||||
| context | — | ✓ |
|
||||
| skill (subcommands) | ✓ | ✓ |
|
||||
| start / stop / restart | ✓ | ✗ |
|
||||
| update | ✓ | ✗ |
|
||||
| install-browser | ✓ | ✗ |
|
||||
|
||||
<Note>
|
||||
`context` only shows a hint in the terminal to use it in chat. `config` is only available in chat.
|
||||
</Note>
|
||||
123
docs/en/commands/process.mdx
Normal file
123
docs/en/commands/process.mdx
Normal file
@@ -0,0 +1,123 @@
|
||||
---
|
||||
title: Process Management
|
||||
description: Manage CowAgent process lifecycle with cow commands
|
||||
---
|
||||
|
||||
Process management commands control the CowAgent background process. These commands are only available in the terminal.
|
||||
|
||||
## start
|
||||
|
||||
Start the CowAgent service. Runs as a background daemon by default and automatically tails logs.
|
||||
|
||||
```bash
|
||||
cow start
|
||||
```
|
||||
|
||||
**Options:**
|
||||
|
||||
| Option | Description |
|
||||
| --- | --- |
|
||||
| `-f`, `--foreground` | Run in foreground, not as a background daemon |
|
||||
| `--no-logs` | Don't tail logs after starting |
|
||||
|
||||
## stop
|
||||
|
||||
Stop the running CowAgent service.
|
||||
|
||||
```bash
|
||||
cow stop
|
||||
```
|
||||
|
||||
## restart
|
||||
|
||||
Restart the CowAgent service (stop then start).
|
||||
|
||||
```bash
|
||||
cow restart
|
||||
```
|
||||
|
||||
**Options:**
|
||||
|
||||
| Option | Description |
|
||||
| --- | --- |
|
||||
| `--no-logs` | Don't tail logs after restart |
|
||||
|
||||
## update
|
||||
|
||||
Update code and restart the service. Automatically performs:
|
||||
|
||||
1. Pull latest code (`git pull`)
|
||||
2. Stop current service
|
||||
3. Update Python dependencies
|
||||
4. Reinstall CLI
|
||||
5. Start service
|
||||
|
||||
```bash
|
||||
cow update
|
||||
```
|
||||
|
||||
<Warning>
|
||||
If `git pull` fails (e.g., uncommitted local changes), the update aborts and the service remains unaffected.
|
||||
</Warning>
|
||||
|
||||
## status
|
||||
|
||||
Check CowAgent service status, including process info, version, and current model/channel configuration.
|
||||
|
||||
```bash
|
||||
cow status
|
||||
```
|
||||
|
||||
## logs
|
||||
|
||||
View service logs.
|
||||
|
||||
```bash
|
||||
cow logs
|
||||
```
|
||||
|
||||
**Options:**
|
||||
|
||||
| Option | Description | Default |
|
||||
| --- | --- | --- |
|
||||
| `-f`, `--follow` | Continuously tail log output | No |
|
||||
| `-n`, `--lines` | Show last N lines | 50 |
|
||||
|
||||
Examples:
|
||||
|
||||
```bash
|
||||
# View last 100 lines
|
||||
cow logs -n 100
|
||||
|
||||
# Continuously tail logs
|
||||
cow logs -f
|
||||
```
|
||||
|
||||
## install-browser
|
||||
|
||||
Install Playwright and Chromium browser for the [browser tool](/en/tools/browser).
|
||||
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
<Tip>
|
||||
Only needed when using browser tools (web browsing, screenshots, etc.).
|
||||
</Tip>
|
||||
|
||||
## run.sh Compatibility
|
||||
|
||||
If Cow CLI is not installed, you can use `run.sh` to manage the service:
|
||||
|
||||
| cow command | run.sh equivalent |
|
||||
| --- | --- |
|
||||
| `cow start` | `./run.sh start` |
|
||||
| `cow stop` | `./run.sh stop` |
|
||||
| `cow restart` | `./run.sh restart` |
|
||||
| `cow update` | `./run.sh update` |
|
||||
| `cow status` | `./run.sh status` |
|
||||
| `cow logs` | `./run.sh logs` |
|
||||
|
||||
<Note>
|
||||
The `cow` command is recommended — it provides cleaner syntax and richer features. It is automatically installed via the one-click install script.
|
||||
</Note>
|
||||
192
docs/en/commands/skill.mdx
Normal file
192
docs/en/commands/skill.mdx
Normal file
@@ -0,0 +1,192 @@
|
||||
---
|
||||
title: Skill Management
|
||||
description: Install, uninstall, enable, disable, and manage skills via commands
|
||||
---
|
||||
|
||||
Skill management commands are used to install, query, and manage CowAgent skills. Use `/skill <subcommand>` in chat or `cow skill <subcommand>` in the terminal.
|
||||
|
||||
## list
|
||||
|
||||
List installed skills and their status.
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill list
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill list
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**Browse the Skill Hub** (view all available skills):
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill list --remote
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill list --remote
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**Options:**
|
||||
|
||||
| Option | Description | Default |
|
||||
| --- | --- | --- |
|
||||
| `--remote`, `-r` | Browse Skill Hub remote skill list | No |
|
||||
| `--page` | Page number for remote listing | 1 |
|
||||
|
||||
## search
|
||||
|
||||
Search for skills on the Skill Hub.
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill search pptx
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill search pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## install
|
||||
|
||||
Install skills with a single `install` command from Cow Skill Hub, GitHub, ClawHub, or any URL (zip archives, SKILL.md links) — no manual download or configuration required.
|
||||
|
||||
**From Skill Hub (recommended):**
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill install pptx
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill install pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**From GitHub:**
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
# Install all skills in a repo (auto-discovers subdirectories with SKILL.md)
|
||||
/skill install larksuite/cli
|
||||
|
||||
# Specify a subdirectory to install a single skill
|
||||
/skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
|
||||
# Use # to specify a subdirectory
|
||||
/skill install larksuite/cli#skills/lark-minutes
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
# Install all skills in a repo (auto-discovers subdirectories with SKILL.md)
|
||||
cow skill install larksuite/cli
|
||||
|
||||
# Specify a subdirectory to install a single skill
|
||||
cow skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
|
||||
# Use # to specify a subdirectory
|
||||
cow skill install larksuite/cli#skills/lark-minutes
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
Supports full GitHub URLs and `owner/repo` shorthand. For mono-repos (multiple skills in one repository), omitting the subdirectory auto-discovers and batch-installs all skills; specifying a subdirectory installs only that skill.
|
||||
|
||||
**From ClawHub:**
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill install clawhub:baidu-search
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill install clawhub:baidu-search
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**From URL:**
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
# Install from a zip archive (single or batch)
|
||||
/skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
|
||||
# Install from a SKILL.md link
|
||||
/skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
# Install from a zip archive (single or batch)
|
||||
cow skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
|
||||
# Install from a SKILL.md link
|
||||
cow skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
Supports installing from zip / tar.gz archive URLs — automatically extracts and discovers directories containing `SKILL.md`, with support for single or batch install. Also supports installing directly from a `SKILL.md` file URL, automatically parsing the skill name and description.
|
||||
|
||||
## uninstall
|
||||
|
||||
Uninstall an installed skill.
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill uninstall pptx
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill uninstall pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
<Warning>
|
||||
Uninstalling deletes all files in the skill directory. This action cannot be undone.
|
||||
</Warning>
|
||||
|
||||
## enable / disable
|
||||
|
||||
Enable or disable a skill. Disabled skills will not be invoked by the Agent.
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill enable pptx
|
||||
/skill disable pptx
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill enable pptx
|
||||
cow skill disable pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## info
|
||||
|
||||
View details of an installed skill, including a preview of its `SKILL.md`.
|
||||
|
||||
<CodeGroup>
|
||||
```text Chat
|
||||
/skill info pptx
|
||||
```
|
||||
|
||||
```bash Terminal
|
||||
cow skill info pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## Skill Sources
|
||||
|
||||
Installed skills track their origin, viewable via `/skill list`:
|
||||
|
||||
| Source | Description |
|
||||
| --- | --- |
|
||||
| `builtin` | Built-in project skills |
|
||||
| `cowhub` | Installed from CowAgent Skill Hub |
|
||||
| `github` | Installed directly from a GitHub URL |
|
||||
| `clawhub` | Installed from ClawHub |
|
||||
| `url` | Installed from a SKILL.md URL |
|
||||
| `local` | Locally created skills |
|
||||
@@ -30,7 +30,25 @@ Optional dependencies (recommended):
|
||||
pip3 install -r requirements-optional.txt
|
||||
```
|
||||
|
||||
### 3. Configure
|
||||
### 3. Install Cow CLI
|
||||
|
||||
Install the command-line tool for managing services and skills:
|
||||
|
||||
```bash
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
Then use the `cow` command:
|
||||
|
||||
```bash
|
||||
cow help
|
||||
```
|
||||
|
||||
<Note>
|
||||
This step is recommended. After installation you can use `cow start`, `cow stop`, `cow update` to manage the service, and `cow skill` to manage skills. Without the CLI, you can use `./run.sh` or `python3 app.py` to run.
|
||||
</Note>
|
||||
|
||||
### 4. Configure
|
||||
|
||||
Copy the config template and edit:
|
||||
|
||||
@@ -40,22 +58,32 @@ cp config-template.json config.json
|
||||
|
||||
Fill in model API keys, channel type, and other settings in `config.json`. See the [model docs](/en/models/index) for details.
|
||||
|
||||
### 4. Run
|
||||
### 5. Run
|
||||
|
||||
**Local run:**
|
||||
**Using Cow CLI (recommended):**
|
||||
|
||||
```bash
|
||||
cow start
|
||||
```
|
||||
|
||||
**Or run locally in foreground:**
|
||||
|
||||
```bash
|
||||
python3 app.py
|
||||
```
|
||||
|
||||
By default, the Web service starts. Access `http://localhost:9899/chat` to chat.
|
||||
By default, the Web console starts. Access `http://localhost:9899` to chat.
|
||||
|
||||
**Background run on server:**
|
||||
**Background run on server (without CLI):**
|
||||
|
||||
```bash
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
```
|
||||
|
||||
<Tip>
|
||||
If deploying on a server, open port `9899` in your firewall or security group to access the Web console. It's recommended to restrict access to specific IPs for security.
|
||||
</Tip>
|
||||
|
||||
## Docker Deployment
|
||||
|
||||
Docker deployment does not require cloning source code or installing dependencies. For Agent mode, source deployment is recommended for broader system access.
|
||||
@@ -84,6 +112,10 @@ sudo docker compose up -d
|
||||
sudo docker logs -f chatgpt-on-wechat
|
||||
```
|
||||
|
||||
<Tip>
|
||||
If deploying on a server, open port `9899` in your firewall or security group to access the Web console. It's recommended to restrict access to specific IPs for security.
|
||||
</Tip>
|
||||
|
||||
## Core Configuration
|
||||
|
||||
```json
|
||||
|
||||
@@ -9,31 +9,46 @@ Supports Linux, macOS, and Windows. Requires Python 3.7-3.12 (3.9 recommended).
|
||||
|
||||
## Install Command
|
||||
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
<Tabs>
|
||||
<Tab title="Linux / macOS">
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Windows (PowerShell)">
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
The script automatically performs these steps:
|
||||
|
||||
1. Check Python environment (requires Python 3.7+)
|
||||
2. Install required tools (git, curl, etc.)
|
||||
3. Clone project to `~/chatgpt-on-wechat`
|
||||
4. Install Python dependencies
|
||||
4. Install Python dependencies and Cow CLI
|
||||
5. Guided configuration for AI model and channel
|
||||
6. Start service
|
||||
|
||||
By default, the Web service starts after installation. Access `http://localhost:9899/chat` to begin chatting.
|
||||
By default, the Web console starts after installation. Access `http://localhost:9899` to begin chatting.
|
||||
|
||||
## Management Commands
|
||||
|
||||
After installation, use these commands to manage the service:
|
||||
After installation, use the `cow` command to manage the service:
|
||||
|
||||
| Command | Description |
|
||||
| --- | --- |
|
||||
| `./run.sh start` | Start service |
|
||||
| `./run.sh stop` | Stop service |
|
||||
| `./run.sh restart` | Restart service |
|
||||
| `./run.sh status` | Check run status |
|
||||
| `./run.sh logs` | View real-time logs |
|
||||
| `./run.sh config` | Reconfigure |
|
||||
| `./run.sh update` | Update project code |
|
||||
| `cow start` | Start service |
|
||||
| `cow stop` | Stop service |
|
||||
| `cow restart` | Restart service |
|
||||
| `cow status` | Check run status |
|
||||
| `cow logs` | View real-time logs |
|
||||
| `cow update` | Update code and restart |
|
||||
| `cow install-browser` | Install browser tool dependencies |
|
||||
|
||||
See the [Commands documentation](/en/commands/index) for more details.
|
||||
|
||||
<Note>
|
||||
If the `cow` command is not available, you can use `./run.sh <command>` (Linux/macOS) or `.\scripts\run.ps1 <command>` (Windows) as a fallback. Both are functionally equivalent.
|
||||
</Note>
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
title: Features
|
||||
description: CowAgent long-term memory, task planning, and skills system in detail
|
||||
description: CowAgent long-term memory, task planning, skills system, CLI commands, and browser tool in detail
|
||||
---
|
||||
|
||||
## 1. Long-term Memory
|
||||
@@ -19,7 +19,7 @@ In subsequent long-term conversations, the Agent intelligently stores or retriev
|
||||
|
||||
Tools are the core of how the Agent accesses operating system resources. The Agent intelligently selects and invokes tools based on task requirements, performing file read/write, command execution, scheduled tasks, and more. Built-in tools are implemented in the project's `agent/tools/` directory.
|
||||
|
||||
**Key tools:** file read/write/edit, Bash terminal, file send, scheduler, memory search, web search, environment config, and more.
|
||||
**Key tools:** file read/write/edit, Bash terminal, browser, file send, scheduler, memory search, web search, environment config, and more.
|
||||
|
||||
### 2.1 Terminal and File Access
|
||||
|
||||
@@ -45,7 +45,15 @@ The `scheduler` tool enables dynamic scheduled tasks, supporting **one-time task
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 2.4 Environment Variable Management
|
||||
### 2.4 Browser
|
||||
|
||||
The built-in `browser` tool allows the Agent to control a Chromium browser to visit web pages, fill forms, click elements, and take screenshots, with support for dynamic JS-rendered pages. Run `cow install-browser` to install with one command, automatically adapting to server (headless) and desktop environments:
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 2.5 Environment Variable Management
|
||||
|
||||
Secrets required by skills are stored in an environment variable file, managed by the `env_config` tool. You can update secrets through conversation, with built-in security protection and desensitization:
|
||||
|
||||
@@ -57,9 +65,12 @@ Secrets required by skills are stored in an environment variable file, managed b
|
||||
|
||||
The Skills system provides infinite extensibility for the Agent. Each Skill consists of a description file, execution scripts (optional), and resources (optional), describing how to complete specific types of tasks. Skills allow the Agent to follow instructions for complex workflows, invoke tools, or integrate third-party systems.
|
||||
|
||||
- **[Skill Hub](https://skills.cowagent.ai/):** An open skill marketplace featuring official, community, and third-party skills. Install with one command.
|
||||
- **Built-in skills:** Located in the project's `skills/` directory, including skill creator, image recognition, LinkAI agent, web fetch, and more. Built-in skills are automatically enabled based on dependency conditions (API keys, system commands, etc.).
|
||||
- **Custom skills:** Created by users through conversation, stored in the workspace (`~/cow/skills/`), capable of implementing any complex business process or third-party integration.
|
||||
|
||||
Install skills: `/skill install <name>` or `cow skill install <name>`, supporting Skill Hub, GitHub, ClawHub, URL, and more.
|
||||
|
||||
### 3.1 Creating Skills
|
||||
|
||||
The `skill-creator` skill enables rapid skill creation through conversation. You can ask the Agent to codify a workflow as a skill, or send any API documentation and examples for the Agent to complete the integration directly:
|
||||
@@ -77,29 +88,33 @@ The `skill-creator` skill enables rapid skill creation through conversation. You
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 3.3 Third-party Knowledge Bases and Plugins
|
||||
### 3.3 Skill Hub
|
||||
|
||||
The `linkai-agent` skill makes all agents on [LinkAI](https://link-ai.tech/) available as Skills for the Agent, enabling multi-agent decision making.
|
||||
Visit [skills.cowagent.ai](https://skills.cowagent.ai/) to browse all available skills, or use commands in conversation:
|
||||
|
||||
Configuration: set `LINKAI_API_KEY` via `env_config`, then add agent descriptions in `skills/linkai-agent/config.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"apps": [
|
||||
{
|
||||
"app_code": "G7z6vKwp",
|
||||
"app_name": "LinkAI Customer Support",
|
||||
"app_description": "Select only when the user needs help with LinkAI platform questions"
|
||||
},
|
||||
{
|
||||
"app_code": "SFY5x7JR",
|
||||
"app_name": "Content Creator",
|
||||
"app_description": "Use only when the user needs to create images or videos"
|
||||
}
|
||||
]
|
||||
}
|
||||
```text
|
||||
/skill list --remote # Browse Skill Hub
|
||||
/skill search <keyword> # Search skills
|
||||
/skill install <name> # Install with one command
|
||||
```
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
|
||||
</Frame>
|
||||
Also supports installing skills from GitHub, ClawHub, LinkAI, and other third-party platforms. See [Install Skills](/en/skills/install) for details.
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="750" />
|
||||
|
||||
## 4. CLI Command System
|
||||
|
||||
CowAgent provides two command interaction methods, covering service management, skill installation, configuration, and more:
|
||||
|
||||
- **Terminal CLI:** Run `cow <command>` in the system terminal, supporting `start`, `stop`, `restart`, `update`, `status`, `logs`, `skill`, etc.
|
||||
- **Chat commands:** Type `/<command>` in conversation. The Web console shows a command menu when you type `/`.
|
||||
|
||||
```bash
|
||||
cow start # Start service
|
||||
cow stop # Stop service
|
||||
cow update # Update and restart
|
||||
cow skill install pptx # Install a skill
|
||||
cow install-browser # Install browser tool
|
||||
```
|
||||
|
||||
See [Command Overview](https://docs.cowagent.ai/en/commands) for details.
|
||||
|
||||
@@ -28,6 +28,12 @@ CowAgent can proactively think and plan tasks, operate computers and external re
|
||||
<Card title="Multimodal Messages" icon="image" href="/en/channels/web">
|
||||
Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
|
||||
</Card>
|
||||
<Card title="Tool System" icon="wrench" href="/en/tools/index">
|
||||
Built-in tools for file I/O, terminal execution, browser automation, scheduled tasks, messaging, and more. The Agent autonomously invokes tools to accomplish complex tasks.
|
||||
</Card>
|
||||
<Card title="Command System" icon="terminal" href="/en/commands/index">
|
||||
Provides terminal CLI and in-chat commands for process management, skill installation, configuration, context inspection, and other common operations.
|
||||
</Card>
|
||||
<Card title="Multiple Model Support" icon="microchip" href="/en/models/index">
|
||||
Supports mainstream model providers including OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and more.
|
||||
</Card>
|
||||
@@ -40,9 +46,18 @@ CowAgent can proactively think and plan tasks, operate computers and external re
|
||||
|
||||
Run the following command in your terminal for one-click install, configuration, and startup:
|
||||
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
<Tabs>
|
||||
<Tab title="Linux / macOS">
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Windows (PowerShell)">
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
By default, the Web service starts after running. Access `http://localhost:9899/chat` to chat in the web interface.
|
||||
|
||||
|
||||
80
docs/en/memory/context.mdx
Normal file
80
docs/en/memory/context.mdx
Normal file
@@ -0,0 +1,80 @@
|
||||
---
|
||||
title: Short-term Memory
|
||||
description: Conversation context — message management, compression strategies, and context operations
|
||||
---
|
||||
|
||||
Conversation context is the Agent's short-term memory, containing all messages in the current session (user input, Agent replies, tool calls and results). Proper context management is critical for the Agent's reasoning quality and cost control.
|
||||
|
||||
## Context Structure
|
||||
|
||||
Each conversation turn consists of:
|
||||
|
||||
```
|
||||
User message → Agent thinking → Tool call → Tool result → ... → Agent final reply
|
||||
```
|
||||
|
||||
A single turn may include multiple tool calls (controlled by `agent_max_steps`). All tool calls and results are retained in context until compressed or trimmed.
|
||||
|
||||
## Key Configuration
|
||||
|
||||
| Parameter | Description | Default |
|
||||
| --- | --- | --- |
|
||||
| `agent_max_context_tokens` | Maximum context token budget | `50000` |
|
||||
| `agent_max_context_turns` | Maximum conversation turns in context | `20` |
|
||||
| `agent_max_steps` | Maximum decision steps per turn (tool call count) | `15` |
|
||||
|
||||
Configurable via `config.json` or the `/config` chat command.
|
||||
|
||||
## Compression Strategy
|
||||
|
||||
When context exceeds limits, the system automatically compresses to free space. The process has multiple stages:
|
||||
|
||||
### 1. Tool Result Truncation
|
||||
|
||||
Before each decision loop, the system checks tool call results in historical turns. Results exceeding **20,000 characters** are truncated, keeping only the beginning and end with a truncation notice. Current turn results are not affected.
|
||||
|
||||
### 2. Turn Trimming
|
||||
|
||||
When conversation turns exceed `agent_max_context_turns`:
|
||||
|
||||
- The **oldest half** of complete turns is trimmed (preserving tool call chain integrity)
|
||||
- Trimmed messages are summarized by LLM and **written to the daily memory file**
|
||||
- Remaining turns stay intact
|
||||
|
||||
### 3. Token Budget Trimming
|
||||
|
||||
After turn trimming, if tokens still exceed the budget:
|
||||
|
||||
- **Fewer than 5 turns**: All turns undergo **text compression** — each turn keeps only the first user text and last Agent reply, removing intermediate tool call chains
|
||||
- **5 or more turns**: The **first half** of turns is trimmed again, with discarded content also written to memory
|
||||
|
||||
### 4. Overflow Emergency Handling
|
||||
|
||||
When the model API returns a context overflow error:
|
||||
|
||||
1. All current messages are summarized and written to memory
|
||||
2. Aggressive trimming is applied (tool results limited to 10K chars, user text to 10K, max 5 turns)
|
||||
3. If still overflowing, the entire conversation context is cleared
|
||||
|
||||
## Session Persistence
|
||||
|
||||
Conversation messages are persisted to a local database, automatically restored after service restart. Restore strategy:
|
||||
|
||||
- Restores the most recent **`max(3, max_context_turns / 6)`** turns
|
||||
- Only retains each turn's **user text and Agent final reply**, not intermediate tool call chains
|
||||
- Sessions older than **30 days** are automatically cleaned up
|
||||
|
||||
## Commands
|
||||
|
||||
Use these commands in chat to manage context:
|
||||
|
||||
| Command | Description |
|
||||
| --- | --- |
|
||||
| `/context` | View current context statistics (message count, role distribution, total characters) |
|
||||
| `/context clear` | Clear current session context |
|
||||
| `/config agent_max_context_tokens 80000` | Adjust context token budget |
|
||||
| `/config agent_max_context_turns 30` | Adjust context turn limit |
|
||||
|
||||
<Tip>
|
||||
After clearing context, the Agent "forgets" previous conversation content. Content that was already written to long-term memory can still be retrieved via memory search.
|
||||
</Tip>
|
||||
@@ -1,30 +1,39 @@
|
||||
---
|
||||
title: Memory
|
||||
description: CowAgent long-term memory system
|
||||
title: Long-term Memory
|
||||
description: CowAgent long-term memory system — file persistence, automatic writing, and hybrid retrieval
|
||||
---
|
||||
|
||||
The memory system enables the Agent to remember important information over time, continuously accumulating experience, understanding user preferences, and truly achieving autonomous thinking and continuous growth.
|
||||
Long-term memory is stored in workspace files, persisting across sessions. The Agent loads historical memory on demand via retrieval tools during conversation, and automatically writes conversation summaries to long-term memory when context is trimmed.
|
||||
|
||||
## Memory Types
|
||||
|
||||
### Core Memory (MEMORY.md)
|
||||
|
||||
Stored in `~/cow/MEMORY.md`, containing long-term user preferences, important decisions, key facts, and other information that doesn't fade over time. Automatically injected into the system prompt on every conversation turn as background knowledge.
|
||||
Stored in `~/cow/MEMORY.md`, containing long-term user preferences, important decisions, key facts, and other information that doesn't fade over time. The Agent reads and writes this file via tools to maintain long-term knowledge.
|
||||
|
||||
### Daily Memory (memory/YYYY-MM-DD.md)
|
||||
|
||||
Stored in `~/cow/memory/` directory, named by date (e.g. `2026-03-08.md`), recording daily conversation summaries and key events. Files are only created on first write to avoid generating empty files.
|
||||
Stored in `~/cow/memory/` directory, named by date (e.g., `2026-03-08.md`), recording daily conversation summaries and key events. Files are only created on first write to avoid generating empty files.
|
||||
|
||||
## Memory Writing
|
||||
## Automatic Writing
|
||||
|
||||
The Agent automatically persists conversation content to daily memory through the following mechanisms:
|
||||
The Agent automatically persists conversation content to long-term memory through the following mechanisms:
|
||||
|
||||
- **On context trimming** — When conversation turns or tokens exceed the configured limit, the oldest half of the context is trimmed in batch, and the discarded content is summarized by LLM into key information and written to the daily memory file
|
||||
- **On context trimming** — When conversation turns or tokens exceed the configured limit, the oldest half of the context is trimmed, and the discarded content is summarized by LLM into key information and written to the daily memory file
|
||||
- **Daily scheduled summary** — A full summary is automatically triggered at 23:55 every day, ensuring memory is preserved even on low-activity days (skipped if content hasn't changed)
|
||||
- **On API context overflow** — When the model API returns a context overflow error, the current conversation summary is saved as an emergency measure
|
||||
|
||||
All memory writes run asynchronously in a background thread (LLM summarization + file writing), never blocking normal conversation replies.
|
||||
|
||||
## Memory Retrieval
|
||||
|
||||
The memory system supports hybrid retrieval modes:
|
||||
|
||||
- **Keyword retrieval** — FTS5 full-text index matching with BM25 ranking
|
||||
- **Vector retrieval** — Embedding-based semantic similarity search, finds relevant memory even with different wording
|
||||
|
||||
The Agent automatically triggers memory retrieval during conversation as needed, incorporating relevant historical information into context. Results are ranked by a combined score (default: 0.7 vector weight + 0.3 keyword weight). Daily memory scores decay over time (30-day half-life), while core memory does not decay.
|
||||
|
||||
## First Launch
|
||||
|
||||
On first launch, the Agent will proactively ask the user for key information and save it to the workspace (default `~/cow`):
|
||||
@@ -40,27 +49,10 @@ On first launch, the Agent will proactively ask the user for key information and
|
||||
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
## Memory Retrieval
|
||||
|
||||
The memory system supports hybrid retrieval modes:
|
||||
|
||||
- **Keyword retrieval** — Match historical memory based on keywords
|
||||
- **Vector retrieval** — Semantic similarity search, finds relevant memory even with different wording
|
||||
|
||||
The Agent automatically triggers memory retrieval during conversation as needed, incorporating relevant historical information into context. Core memory (`MEMORY.md`) is always injected into the system prompt, while daily memory is loaded on demand via retrieval.
|
||||
|
||||
## Configuration
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_workspace": "~/cow",
|
||||
"agent_max_context_tokens": 40000,
|
||||
"agent_max_context_turns": 20
|
||||
}
|
||||
```
|
||||
|
||||
| Parameter | Description | Default |
|
||||
| --- | --- | --- |
|
||||
| `agent_workspace` | Workspace path, memory files stored under this directory | `~/cow` |
|
||||
| `agent_max_context_tokens` | Max context tokens; when exceeded, half is trimmed and summarized into memory | `40000` |
|
||||
| `agent_max_context_turns` | Max context turns; when exceeded, half is trimmed and summarized into memory | `20` |
|
||||
| `agent_max_context_tokens` | Max context tokens; when exceeded, content is trimmed and summarized into memory | `50000` |
|
||||
| `agent_max_context_turns` | Max context turns; when exceeded, content is trimmed and summarized into memory | `20` |
|
||||
@@ -5,6 +5,7 @@ description: CowAgent version history
|
||||
|
||||
| Version | Date | Description |
|
||||
| --- | --- | --- |
|
||||
| [2.0.5](/en/releases/v2.0.5) | 2026.04.01 | Cow CLI, Skill Hub open source, Browser tool, WeCom Bot QR scan, and more |
|
||||
| [2.0.4](/en/releases/v2.0.4) | 2026.03.22 | Personal WeChat channel, new model support, Japanese docs, script refactoring and bug fixes |
|
||||
| [2.0.2](/en/releases/v2.0.2) | 2026.02.27 | Web Console upgrade, multi-channel concurrency, session persistence |
|
||||
| [2.0.1](/en/releases/v2.0.1) | 2026.02.27 | Built-in Web Search tool, smart context management, multiple fixes |
|
||||
|
||||
77
docs/en/releases/v2.0.5.mdx
Normal file
77
docs/en/releases/v2.0.5.mdx
Normal file
@@ -0,0 +1,77 @@
|
||||
---
|
||||
title: v2.0.5
|
||||
description: CowAgent 2.0.5 - Cow CLI, Skill Hub open source, Browser tool, WeCom Bot QR scan, and more
|
||||
---
|
||||
|
||||
## 🖥️ Cow CLI
|
||||
|
||||
New CLI command system for managing CowAgent from terminal and chat:
|
||||
|
||||
- **Terminal commands**: Run `cow <command>` for `start`, `stop`, `restart`, `update`, `status`, `logs`, etc.
|
||||
- **Chat commands**: Type `/<command>` in conversation for `/help`, `/status`, `/config`, `/skill`, `/context`, `/logs`, `/version`, etc.
|
||||
- **Web console**: Type `/` in the input box to open a slash command menu, with arrow-key input history
|
||||
- **Windows support**: New PowerShell script `scripts/run.ps1` with `cow` command support
|
||||
|
||||
Docs: [Command Overview](https://docs.cowagent.ai/en/commands)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />
|
||||
|
||||
## 🧩 Cow Skill Hub Open Source
|
||||
|
||||
[Cow Skill Hub](https://skills.cowagent.ai) is now open source and live — browse, search, install, and publish AI Agent skills:
|
||||
|
||||
- **One-command install**: `/skill install <name>` in chat or `cow skill install <name>` in terminal
|
||||
- **Multi-source**: Install from Skill Hub, GitHub, ClawHub, LinkAI, and more
|
||||
- **Search**: `/skill search` and `/skill list --remote` to browse the hub
|
||||
- **Publish**: Submit your own skills at [skills.cowagent.ai/submit](https://skills.cowagent.ai/submit)
|
||||
- **Mirror**: Mirror acceleration for faster downloads in China
|
||||
|
||||
Open source repo: [cow-skill-hub](https://github.com/zhayujie/cow-skill-hub)
|
||||
|
||||
Docs: [Skill Hub](https://docs.cowagent.ai/en/skills/hub), [Install Skills](https://docs.cowagent.ai/en/skills/install)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="750" />
|
||||
|
||||
## 🌐 Browser Tool
|
||||
|
||||
New Browser tool — Agent can control a Chromium browser to visit and interact with web pages:
|
||||
|
||||
- **Navigation & interaction**: `navigate`, `click`, `fill`, `select`, `scroll`, `press`, etc.
|
||||
- **Page snapshot**: Compact DOM snapshot for efficient page understanding, auto-snapshot after navigation
|
||||
- **Screenshot**: Save page screenshots to workspace
|
||||
- **JavaScript execution**: Run custom scripts on pages
|
||||
- **CLI install**: `cow install-browser` for one-command setup
|
||||
- **Docker support**: Browser install built into Docker image
|
||||
|
||||
Docs: [Browser Tool](https://docs.cowagent.ai/en/tools/browser)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401115728.png" width="750" />
|
||||
|
||||
## 🤖 WeCom Bot QR Code Setup
|
||||
|
||||
WeCom Bot channel now supports QR code scan for one-click bot creation:
|
||||
|
||||
- **QR scan in Web console**: Select "Scan QR" mode, scan with WeCom to auto-create and connect a bot — no manual configuration needed
|
||||
- **Manual mode**: Still supports manual Bot ID and Secret input
|
||||
- **Stream push optimization**: Throttled push to avoid WebSocket congestion
|
||||
|
||||
Docs: [WeCom Bot](https://docs.cowagent.ai/en/channels/wecom-bot)
|
||||
|
||||
PR: [#2735](https://github.com/zhayujie/chatgpt-on-wechat/pull/2735). Thanks [@WecomTeam](https://github.com/WecomTeam)
|
||||
|
||||
## 🐛 Other Improvements & Fixes
|
||||
|
||||
- **DeepSeek module**: Independent DeepSeek Bot with dedicated `deepseek_api_key` config ([#2719](https://github.com/zhayujie/chatgpt-on-wechat/pull/2719)). Thanks [@6vision](https://github.com/6vision)
|
||||
- **Web console**: Slash command menu, input history, new model options, mobile optimization ([#2731](https://github.com/zhayujie/chatgpt-on-wechat/pull/2731)). Thanks [@zkjqd](https://github.com/zkjqd)
|
||||
- **Context loss**: Fix context loss after trimming ([393f0c0](https://github.com/zhayujie/chatgpt-on-wechat/commit/393f0c0))
|
||||
- **System prompt**: Fix system prompt not rebuilding on every turn ([13f5fde](https://github.com/zhayujie/chatgpt-on-wechat/commit/13f5fde))
|
||||
- **Gemini**: Fix missing model attribute in GoogleGeminiBot ([#2716](https://github.com/zhayujie/chatgpt-on-wechat/pull/2716)). Thanks [@cowagent](https://github.com/cowagent)
|
||||
- **WeChat channel**: Fix file send failures and filename loss ([6d9b7ba](https://github.com/zhayujie/chatgpt-on-wechat/commit/6d9b7ba), [45faa9c](https://github.com/zhayujie/chatgpt-on-wechat/commit/45faa9c))
|
||||
- **Docker**: Fix volume permissions, reduce image size ([3eb8348](https://github.com/zhayujie/chatgpt-on-wechat/commit/3eb8348), [4470d4c](https://github.com/zhayujie/chatgpt-on-wechat/commit/4470d4c))
|
||||
- **Security**: Fix Memory Content path traversal risk. Thanks [@August829](https://github.com/August829)
|
||||
|
||||
## 📦 Upgrade
|
||||
|
||||
Run `cow update` or `./run.sh update` to upgrade, or pull the latest code and restart. See [Upgrade Guide](https://docs.cowagent.ai/en/guide/upgrade).
|
||||
|
||||
**Release Date**: 2026.04.01 | [Full Changelog](https://github.com/zhayujie/chatgpt-on-wechat/compare/2.0.4...master)
|
||||
58
docs/en/skills/create.mdx
Normal file
58
docs/en/skills/create.mdx
Normal file
@@ -0,0 +1,58 @@
|
||||
---
|
||||
title: Create Skills
|
||||
description: Create custom skills through conversation
|
||||
---
|
||||
|
||||
CowAgent includes a built-in Skill Creator that lets you quickly create, install, or update skills through natural language conversation.
|
||||
|
||||
## Usage
|
||||
|
||||
Simply describe the skill you want in a conversation, and the Agent will handle the creation:
|
||||
|
||||
- Codify workflows as skills: "Create a skill from this deployment process"
|
||||
- Integrate third-party APIs: "Create a skill based on this API documentation"
|
||||
- Install remote skills: "Install xxx skill for me"
|
||||
|
||||
## Creation Flow
|
||||
|
||||
1. Tell the Agent what skill you want to create
|
||||
2. Agent automatically generates `SKILL.md` description and execution scripts
|
||||
3. Skill is saved to the workspace `~/cow/skills/` directory
|
||||
4. Agent will automatically recognize and use the skill in future conversations
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
## SKILL.md Format
|
||||
|
||||
Created skills follow the standard SKILL.md format:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: my-skill
|
||||
description: Brief description of the skill
|
||||
metadata:
|
||||
emoji: 🔧
|
||||
requires:
|
||||
bins: ["curl"]
|
||||
env: ["MY_API_KEY"]
|
||||
primaryEnv: "MY_API_KEY"
|
||||
---
|
||||
|
||||
# My Skill
|
||||
|
||||
Detailed instructions...
|
||||
```
|
||||
|
||||
| Field | Description |
|
||||
| --- | --- |
|
||||
| `name` | Skill name, must match directory name |
|
||||
| `description` | Skill description, Agent decides whether to invoke based on this |
|
||||
| `metadata.requires.bins` | Required system commands |
|
||||
| `metadata.requires.env` | Required environment variables |
|
||||
| `metadata.always` | Always load (default false) |
|
||||
|
||||
<Tip>
|
||||
See the [Skill Creator documentation](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md) for details.
|
||||
</Tip>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: Image Vision
|
||||
description: Recognize images using OpenAI vision models
|
||||
---
|
||||
|
||||
Analyze image content using OpenAI's GPT-4 Vision API, understanding objects, text, colors, and other elements in images.
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Dependency | Description |
|
||||
| --- | --- |
|
||||
| `OPENAI_API_KEY` | OpenAI API key |
|
||||
| `curl`, `base64` | System commands (usually pre-installed) |
|
||||
|
||||
Configuration:
|
||||
|
||||
- Configure `OPENAI_API_KEY` via the `env_config` tool
|
||||
- Or set `open_ai_api_key` in `config.json`
|
||||
|
||||
## Supported Models
|
||||
|
||||
- `gpt-4.1-mini` (recommended, cost-effective)
|
||||
- `gpt-4.1`
|
||||
|
||||
## Usage
|
||||
|
||||
Once configured, send an image to the Agent to automatically trigger image recognition.
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
|
||||
</Frame>
|
||||
@@ -7,20 +7,17 @@ Skills provide infinite extensibility for the Agent. Each Skill consists of a de
|
||||
|
||||
The difference between Skills and Tools: Tools are atomic operations implemented in code (e.g., file read/write, command execution), while Skills are high-level workflows based on description files that can combine multiple Tools to complete complex tasks.
|
||||
|
||||
## Built-in Skills
|
||||
## Getting Skills
|
||||
|
||||
Located in the project `skills/` directory, automatically enabled based on dependency conditions:
|
||||
CowAgent offers multiple ways to acquire skills:
|
||||
|
||||
| Skill | Description | Dependencies |
|
||||
| --- | --- | --- |
|
||||
| [`skill-creator`](/en/skills/skill-creator) | Create custom skills through conversation | None |
|
||||
| [`openai-image-vision`](/en/skills/image-vision) | Recognize images using OpenAI vision models | `OPENAI_API_KEY` |
|
||||
| [`linkai-agent`](/en/skills/linkai-agent) | Integrate LinkAI platform agents | `LINKAI_API_KEY` |
|
||||
| [`web-fetch`](/en/skills/web-fetch) | Fetch web page text content | `curl` (enabled by default) |
|
||||
- **Cow Skill Hub** — Browse and install community skills via `/skill list --remote`
|
||||
- **GitHub** — Install directly from GitHub repositories, with batch install support
|
||||
- **ClawHub** — Install ClawHub skills via `/skill install clawhub:name`
|
||||
- **URL** — Install from zip archives or SKILL.md links
|
||||
- **Conversational creation** — Let the Agent create skills through natural language conversation
|
||||
|
||||
## Custom Skills
|
||||
|
||||
Created by users through conversation, stored in workspace (`~/cow/skills/`), can implement any complex business process and third-party system integration.
|
||||
See [Install Skills](/en/skills/install) and [Skill Management Commands](/en/commands/skill) for details. You can also [create skills](/en/skills/create) through conversation.
|
||||
|
||||
## Skill Loading Priority
|
||||
|
||||
|
||||
53
docs/en/skills/install.mdx
Normal file
53
docs/en/skills/install.mdx
Normal file
@@ -0,0 +1,53 @@
|
||||
---
|
||||
title: Install Skills
|
||||
description: Install skills from multiple sources with a single command
|
||||
---
|
||||
|
||||
CowAgent supports installing skills from **Cow Skill Hub, GitHub, ClawHub**, and any URL with a unified `install` command. Use `/skill install` in chat or `cow skill install` in the terminal.
|
||||
|
||||
## From Skill Hub
|
||||
|
||||
Browse the Skill Hub and install:
|
||||
|
||||
```text
|
||||
/skill list --remote
|
||||
/skill install pptx
|
||||
```
|
||||
|
||||
## From GitHub
|
||||
|
||||
Supports batch install from repositories and single skill from subdirectories:
|
||||
|
||||
```text
|
||||
/skill install larksuite/cli
|
||||
/skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
```
|
||||
|
||||
## From ClawHub
|
||||
|
||||
```text
|
||||
/skill install clawhub:baidu-search
|
||||
```
|
||||
|
||||
## From URL
|
||||
|
||||
Supports zip archives and SKILL.md file links:
|
||||
|
||||
```text
|
||||
/skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
/skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
|
||||
## Manage Skills
|
||||
|
||||
```text
|
||||
/skill list # View installed skills
|
||||
/skill info pptx # View skill details
|
||||
/skill enable pptx # Enable a skill
|
||||
/skill disable pptx # Disable a skill
|
||||
/skill uninstall pptx # Uninstall a skill
|
||||
```
|
||||
|
||||
<Tip>
|
||||
All commands above work in the terminal by replacing `/skill` with `cow skill`. See [Skill Management Commands](/en/commands/skill) for full documentation.
|
||||
</Tip>
|
||||
@@ -1,47 +0,0 @@
|
||||
---
|
||||
title: LinkAI Agent
|
||||
description: Integrate LinkAI platform multi-agent skill
|
||||
---
|
||||
|
||||
Use agents from the [LinkAI](https://link-ai.tech/) platform as Skills for multi-agent decision-making. The Agent intelligently selects based on agent names and descriptions, calling the corresponding application or workflow via `app_code`.
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Dependency | Description |
|
||||
| --- | --- |
|
||||
| `LINKAI_API_KEY` | LinkAI platform API key, created in [Console](https://link-ai.tech/console/interface) |
|
||||
| `curl` | System command (usually pre-installed) |
|
||||
|
||||
Configuration:
|
||||
|
||||
- Configure `LINKAI_API_KEY` via the `env_config` tool
|
||||
- Or set `linkai_api_key` in `config.json`
|
||||
|
||||
## Configure Agents
|
||||
|
||||
Add available agents in `skills/linkai-agent/config.json`:
|
||||
|
||||
```json
|
||||
{
|
||||
"apps": [
|
||||
{
|
||||
"app_code": "G7z6vKwp",
|
||||
"app_name": "LinkAI Customer Support",
|
||||
"app_description": "Select this assistant only when the user needs help with LinkAI platform questions"
|
||||
},
|
||||
{
|
||||
"app_code": "SFY5x7JR",
|
||||
"app_name": "Content Creator",
|
||||
"app_description": "Use this assistant only when the user needs to create images or videos"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## Usage
|
||||
|
||||
Once configured, the Agent will automatically select the appropriate LinkAI agent based on the user's question.
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
|
||||
</Frame>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: Skill Creator
|
||||
description: Create custom skills through conversation
|
||||
---
|
||||
|
||||
Quickly create, install, or update skills through natural language conversation.
|
||||
|
||||
## Dependencies
|
||||
|
||||
No extra dependencies, always available.
|
||||
|
||||
## Usage
|
||||
|
||||
- Codify workflows as skills: "Create a skill from this deployment process"
|
||||
- Integrate third-party APIs: "Create a skill based on this API documentation"
|
||||
- Install remote skills: "Install xxx skill for me"
|
||||
|
||||
## Creation Flow
|
||||
|
||||
1. Tell the Agent what skill you want to create
|
||||
2. Agent automatically generates `SKILL.md` description and execution scripts
|
||||
3. Skill is saved to the workspace `~/cow/skills/` directory
|
||||
4. Agent will automatically recognize and use the skill in future conversations
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
<Tip>
|
||||
See the [Skill Creator documentation](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md) for details.
|
||||
</Tip>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: Web Fetch
|
||||
description: Fetch web page text content
|
||||
---
|
||||
|
||||
Use curl to fetch web pages and extract readable text content. A lightweight web access method without browser automation.
|
||||
|
||||
## Dependencies
|
||||
|
||||
| Dependency | Description |
|
||||
| --- | --- |
|
||||
| `curl` | System command (usually pre-installed) |
|
||||
|
||||
This skill has `always: true` set, enabled by default as long as the system has the `curl` command.
|
||||
|
||||
## Usage
|
||||
|
||||
Automatically invoked when the Agent needs to fetch content from a URL, no extra configuration needed.
|
||||
|
||||
## Comparison with browser Tool
|
||||
|
||||
| Feature | web-fetch (skill) | browser (tool) |
|
||||
| --- | --- | --- |
|
||||
| Dependencies | curl only | browser-use + playwright |
|
||||
| JS rendering | Not supported | Supported |
|
||||
| Page interaction | Not supported | Supports click, type, etc. |
|
||||
| Best for | Static page text | Dynamic web pages |
|
||||
|
||||
<Tip>
|
||||
For most web content retrieval scenarios, web-fetch is sufficient. Only use the browser tool when you need JS rendering or page interaction.
|
||||
</Tip>
|
||||
@@ -32,7 +32,39 @@ pip3 install -r requirements-optional.txt
|
||||
|
||||
> 国内网络可使用镜像源加速:`pip3 install -r requirements.txt -i https://pypi.tuna.tsinghua.edu.cn/simple`
|
||||
|
||||
### 3. 配置
|
||||
### 3. 安装 Cow CLI
|
||||
|
||||
安装命令行工具,用于管理服务和技能:
|
||||
|
||||
```bash
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
安装后即可使用 `cow` 命令:
|
||||
|
||||
```bash
|
||||
cow help
|
||||
```
|
||||
|
||||
<Note>
|
||||
此步骤为推荐操作。安装后可以使用 `cow start`、`cow stop`、`cow update` 等命令管理服务,也可以使用 `cow skill` 管理技能。如果不安装 CLI,可以使用 `./run.sh` 或 `python3 app.py` 运行。
|
||||
</Note>
|
||||
|
||||
### 3.1 安装浏览器工具(可选)
|
||||
|
||||
如需使用浏览器工具(控制浏览器访问网页、填写表单等),运行:
|
||||
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
该命令会自动安装 Playwright 和 Chromium 浏览器。详细说明参考 [浏览器工具文档](/tools/browser)。
|
||||
|
||||
<Note>
|
||||
浏览器工具依赖较重(~300MB),如不需要可跳过,不影响其他功能正常使用。
|
||||
</Note>
|
||||
|
||||
### 4. 配置
|
||||
|
||||
复制配置文件模板并编辑:
|
||||
|
||||
@@ -42,9 +74,15 @@ cp config-template.json config.json
|
||||
|
||||
在 `config.json` 中填写模型 API Key 和通道类型等配置,详细说明参考各 [模型文档](/models/minimax)。
|
||||
|
||||
### 4. 运行
|
||||
### 5. 运行
|
||||
|
||||
**本地运行:**
|
||||
**使用 Cow CLI 运行(推荐):**
|
||||
|
||||
```bash
|
||||
cow start
|
||||
```
|
||||
|
||||
**或者本地前台运行:**
|
||||
|
||||
```bash
|
||||
python3 app.py
|
||||
@@ -52,7 +90,7 @@ python3 app.py
|
||||
|
||||
运行后默认启动 Web 控制台,访问 `http://localhost:9899` 开始对话和管理Agent。
|
||||
|
||||
**服务器后台运行:**
|
||||
**服务器后台运行(不使用 CLI 时):**
|
||||
|
||||
```bash
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
@@ -96,28 +134,44 @@ sudo docker logs -f chatgpt-on-wechat
|
||||
|
||||
## 核心配置项
|
||||
|
||||
```json
|
||||
{
|
||||
"channel_type": "web",
|
||||
"model": "MiniMax-M2.5",
|
||||
"agent": true,
|
||||
"agent_workspace": "~/cow",
|
||||
"agent_max_context_tokens": 40000,
|
||||
"agent_max_context_turns": 30,
|
||||
"agent_max_steps": 15
|
||||
}
|
||||
```
|
||||
<Tabs>
|
||||
<Tab title="源码部署(config.json)">
|
||||
```json
|
||||
{
|
||||
"channel_type": "web",
|
||||
"model": "MiniMax-M2.7",
|
||||
"agent": true,
|
||||
"agent_workspace": "~/cow",
|
||||
"agent_max_context_tokens": 40000,
|
||||
"agent_max_context_turns": 30,
|
||||
"agent_max_steps": 15
|
||||
}
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Docker 部署(docker-compose.yml)">
|
||||
```yaml
|
||||
environment:
|
||||
CHANNEL_TYPE: 'web'
|
||||
MODEL: 'MiniMax-M2.7'
|
||||
MINIMAX_API_KEY: 'your-api-key'
|
||||
AGENT: 'True'
|
||||
AGENT_MAX_CONTEXT_TOKENS: 40000
|
||||
AGENT_MAX_CONTEXT_TURNS: 30
|
||||
AGENT_MAX_STEPS: 15
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
| 参数 | 说明 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `channel_type` | 接入渠道类型 | `web` |
|
||||
| `model` | 模型名称 | `MiniMax-M2.5` |
|
||||
| `agent` | 是否启用 Agent 模式 | `true` |
|
||||
| `agent_workspace` | Agent 工作空间路径 | `~/cow` |
|
||||
| `agent_max_context_tokens` | 最大上下文 tokens | `40000` |
|
||||
| `agent_max_context_turns` | 最大上下文记忆轮次 | `30` |
|
||||
| `agent_max_steps` | 单次任务最大决策步数 | `15` |
|
||||
| 参数 | 环境变量 | 说明 | 默认值 |
|
||||
| --- | --- | --- | --- |
|
||||
| `channel_type` | `CHANNEL_TYPE` | 接入渠道类型 | `web` |
|
||||
| `model` | `MODEL` | 模型名称 | `MiniMax-M2.5` |
|
||||
| `agent` | `AGENT` | 是否启用 Agent 模式 | `true` |
|
||||
| `agent_workspace` | - | Agent 工作空间路径 | `~/cow` |
|
||||
| `agent_max_context_tokens` | `AGENT_MAX_CONTEXT_TOKENS` | 最大上下文 tokens | `40000` |
|
||||
| `agent_max_context_turns` | `AGENT_MAX_CONTEXT_TURNS` | 最大上下文记忆轮次 | `30` |
|
||||
| `agent_max_steps` | `AGENT_MAX_STEPS` | 单次任务最大决策步数 | `15` |
|
||||
|
||||
<Tip>
|
||||
全部配置项可在项目 [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py) 文件中查看。
|
||||
全部配置项可在项目 [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py) 文件中查看。Docker 部署时,配置项名称需转为大写环境变量格式。
|
||||
</Tip>
|
||||
|
||||
@@ -9,16 +9,25 @@ description: 使用脚本一键安装和管理 CowAgent
|
||||
|
||||
## 安装命令
|
||||
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
<Tabs>
|
||||
<Tab title="Linux / macOS">
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Windows (PowerShell)">
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
脚本自动执行以下流程:
|
||||
|
||||
1. 检查 Python 环境(需要 Python 3.7+)
|
||||
2. 安装必要工具(git、curl 等)
|
||||
3. 克隆项目代码到 `~/chatgpt-on-wechat`
|
||||
4. 安装 Python 依赖
|
||||
4. 安装 Python 依赖和 Cow CLI
|
||||
5. 引导配置 AI 模型和通信渠道
|
||||
6. 启动服务
|
||||
|
||||
@@ -26,14 +35,20 @@ bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
|
||||
## 管理命令
|
||||
|
||||
安装完成后,可使用以下命令管理服务:
|
||||
安装完成后,使用 `cow` CLI 管理服务:
|
||||
|
||||
| 命令 | 说明 |
|
||||
| --- | --- |
|
||||
| `./run.sh start` | 启动服务 |
|
||||
| `./run.sh stop` | 停止服务 |
|
||||
| `./run.sh restart` | 重启服务 |
|
||||
| `./run.sh status` | 查看运行状态 |
|
||||
| `./run.sh logs` | 查看实时日志 |
|
||||
| `./run.sh config` | 重新配置 |
|
||||
| `./run.sh update` | 更新项目代码 |
|
||||
| `cow start` | 启动服务 |
|
||||
| `cow stop` | 停止服务 |
|
||||
| `cow restart` | 重启服务 |
|
||||
| `cow status` | 查看运行状态 |
|
||||
| `cow logs` | 查看实时日志 |
|
||||
| `cow update` | 更新代码并重启 |
|
||||
| `cow install-browser` | 安装浏览器工具依赖 |
|
||||
|
||||
更多命令和用法参考 [命令文档](/commands/index)。
|
||||
|
||||
<Note>
|
||||
如果 `cow` 命令不可用,也可以使用 `./run.sh <命令>`(Linux/macOS)或 `.\scripts\run.ps1 <命令>`(Windows)作为替代,功能等效。
|
||||
</Note>
|
||||
|
||||
@@ -3,20 +3,25 @@ title: 更新升级
|
||||
description: CowAgent 的升级方式说明
|
||||
---
|
||||
|
||||
## 脚本升级(推荐)
|
||||
## 命令升级(推荐)
|
||||
|
||||
如果使用 `run.sh` 管理服务,在项目根目录执行以下命令即可一键升级:
|
||||
使用 `cow update` 一键完成代码更新和服务重启:
|
||||
|
||||
```bash
|
||||
./run.sh update
|
||||
cow update
|
||||
```
|
||||
|
||||
该命令会自动完成以下流程:
|
||||
|
||||
1. 停止当前运行的服务
|
||||
2. 拉取最新代码
|
||||
3. 重新检查依赖
|
||||
4. 启动服务
|
||||
1. 拉取最新代码(`git pull`)
|
||||
2. 停止当前服务
|
||||
3. 更新 Python 依赖
|
||||
4. 重新安装 CLI
|
||||
5. 启动服务
|
||||
|
||||
<Note>
|
||||
如果未安装 Cow CLI,也可以使用 `./run.sh update` 完成相同操作。
|
||||
</Note>
|
||||
|
||||
## 手动升级
|
||||
|
||||
@@ -25,15 +30,19 @@ description: CowAgent 的升级方式说明
|
||||
```bash
|
||||
git pull
|
||||
pip3 install -r requirements.txt
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
更新完成后重启服务:
|
||||
|
||||
```bash
|
||||
# 如果使用 run.sh 管理
|
||||
# 使用 Cow CLI
|
||||
cow restart
|
||||
|
||||
# 或使用 run.sh
|
||||
./run.sh restart
|
||||
|
||||
# 如果使用 nohup 直接运行
|
||||
# 或使用 nohup 直接运行
|
||||
kill $(ps -ef | grep app.py | grep -v grep | awk '{print $2}')
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
```
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
title: 功能介绍
|
||||
description: CowAgent 长期记忆、任务规划、技能系统详细说明
|
||||
description: CowAgent 长期记忆、任务规划、技能系统、CLI 命令、浏览器工具详细说明
|
||||
---
|
||||
|
||||
## 1. 长期记忆
|
||||
@@ -19,7 +19,7 @@ description: CowAgent 长期记忆、任务规划、技能系统详细说明
|
||||
|
||||
工具是 Agent 访问操作系统资源的核心,Agent 会根据任务需求智能选择和调用工具,完成文件读写、命令执行、定时任务等各类操作。内置工具的实现在项目的 `agent/tools/` 目录下。
|
||||
|
||||
**主要工具:** 文件读写编辑、Bash 终端、文件发送、定时调度、记忆搜索、联网搜索、环境配置等。
|
||||
**主要工具:** 文件读写编辑、Bash 终端、浏览器操作、文件发送、定时调度、记忆搜索、联网搜索、环境配置等。
|
||||
|
||||
### 2.1 终端和文件访问
|
||||
|
||||
@@ -45,7 +45,15 @@ description: CowAgent 长期记忆、任务规划、技能系统详细说明
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 2.4 环境变量管理
|
||||
### 2.4 浏览器操作
|
||||
|
||||
内置 `browser` 工具,Agent 可控制浏览器访问网页、填写表单、点击元素、截图,支持动态 JS 渲染页面。运行 `cow install-browser` 一键安装,自动适配服务器(无头模式)和桌面环境:
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401115728.png" width="750" />
|
||||
</Frame>
|
||||
|
||||
### 2.5 环境变量管理
|
||||
|
||||
技能所需的秘钥存储在环境变量文件中,由 `env_config` 工具进行管理,你可以通过对话的方式更新秘钥,工具内置安全保护和脱敏策略:
|
||||
|
||||
@@ -57,9 +65,12 @@ description: CowAgent 长期记忆、任务规划、技能系统详细说明
|
||||
|
||||
技能系统为 Agent 提供无限的扩展性,每个 Skill 由说明文件、运行脚本(可选)、资源(可选)组成,描述如何完成特定类型的任务。通过 Skill 可以让 Agent 遵循说明完成复杂流程、调用各类工具或对接第三方系统。
|
||||
|
||||
- **[Skill Hub](https://skills.cowagent.ai/):** 开放的技能广场,汇集官方推荐、社区贡献和第三方技能,支持一键安装。
|
||||
- **内置技能:** 在项目的 `skills/` 目录下,包含技能创造器、图像识别、LinkAI 智能体、网页抓取等。内置 Skill 根据依赖条件(API Key、系统命令等)自动判断是否启用。
|
||||
- **自定义技能:** 由用户通过对话创建,存放在工作空间中(`~/cow/skills/`),可实现任何复杂的业务流程和第三方系统对接。
|
||||
|
||||
安装技能:`/skill install <名称>` 或 `cow skill install <名称>`,支持从 Skill Hub、GitHub、ClawHub、URL 等来源安装。
|
||||
|
||||
### 3.1 创建技能
|
||||
|
||||
通过 `skill-creator` 技能可以通过对话的方式快速创建技能。你可以让 Agent 将某个工作流程固化为技能,或者把任意接口文档和示例发送给 Agent,让他直接完成对接:
|
||||
@@ -77,29 +88,36 @@ description: CowAgent 长期记忆、任务规划、技能系统详细说明
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 3.3 三方知识库和插件
|
||||
### 3.3 技能广场
|
||||
|
||||
`linkai-agent` 技能可以将 [LinkAI](https://link-ai.tech/) 上的所有智能体作为 Skill 交给 Agent 使用,实现多智能体决策效果。
|
||||
访问 [skills.cowagent.ai](https://skills.cowagent.ai/) 浏览所有可用技能,或在对话中执行:
|
||||
|
||||
配置方式:通过 `env_config` 配置 `LINKAI_API_KEY`,并在 `skills/linkai-agent/config.json` 中添加智能体说明:
|
||||
|
||||
```json
|
||||
{
|
||||
"apps": [
|
||||
{
|
||||
"app_code": "G7z6vKwp",
|
||||
"app_name": "LinkAI客服助手",
|
||||
"app_description": "当用户需要了解LinkAI平台相关问题时才选择该助手"
|
||||
},
|
||||
{
|
||||
"app_code": "SFY5x7JR",
|
||||
"app_name": "内容创作助手",
|
||||
"app_description": "当用户需要创作图片或视频时才使用该助手"
|
||||
}
|
||||
]
|
||||
}
|
||||
```text
|
||||
/skill list --remote # 浏览技能广场
|
||||
/skill search <关键词> # 搜索技能
|
||||
/skill install <名称> # 一键安装
|
||||
```
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
|
||||
</Frame>
|
||||
同时还支持安装Github、ClawHub、LinkAI等第三方平台上的所有技能,详情查看 [技能安装](/skills/install)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="750" />
|
||||
|
||||
|
||||
## 4. CLI 命令系统
|
||||
|
||||
CowAgent 提供两种命令交互方式,覆盖服务管理、技能安装、配置调整等日常运维操作:
|
||||
|
||||
- **终端 CLI:** 在系统终端执行 `cow <命令>`,支持 `start`、`stop`、`restart`、`update`、`status`、`logs`、`skill` 等
|
||||
- **对话命令:** 在对话中输入 `/<命令>`,Web 控制台输入 `/` 可弹出指令菜单快速选择
|
||||
|
||||
```bash
|
||||
cow start # 启动服务
|
||||
cow stop # 停止服务
|
||||
cow update # 更新并重启
|
||||
cow skill install pptx # 安装技能
|
||||
cow install-browser # 安装浏览器工具
|
||||
```
|
||||
|
||||
详细命令参考 [命令总览](https://docs.cowagent.ai/commands)。
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />
|
||||
|
||||
@@ -22,7 +22,7 @@ CowAgent 支持灵活切换多种模型,能处理文本、语音、图片、
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="复杂任务规划" icon="brain" href="/intro/architecture">
|
||||
能够理解复杂任务并自主规划执行,持续思考和调用工具直到完成目标,支持通过工具操作访问文件、终端、浏览器、定时任务等系统资源。
|
||||
能够理解复杂任务并自主规划执行,持续思考和调用各类工具和技能直到完成目标。
|
||||
</Card>
|
||||
<Card title="长期记忆" icon="database" href="/memory">
|
||||
自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索。
|
||||
@@ -33,10 +33,16 @@ CowAgent 支持灵活切换多种模型,能处理文本、语音、图片、
|
||||
<Card title="多模态消息" icon="image" href="/channels/web">
|
||||
支持对文本、图片、语音、文件等多类型消息进行解析、处理、生成、发送等操作。
|
||||
</Card>
|
||||
<Card title="多模型接入" icon="microchip" href="/models/index">
|
||||
<Card title="工具系统" icon="wrench" href="/tools/index">
|
||||
内置文件读写、终端执行、浏览器操作、定时任务、消息发送等工具,Agent 可自主调用工具完成复杂任务。
|
||||
</Card>
|
||||
<Card title="命令系统" icon="terminal" href="/commands/index">
|
||||
提供终端 CLI 和对话中的命令,支持进程管理、技能安装、配置修改、上下文查看等常用操作。
|
||||
</Card>
|
||||
<Card title="多模型支持" icon="microchip" href="/models/index">
|
||||
支持 OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao 等国内外主流模型厂商。
|
||||
</Card>
|
||||
<Card title="多端部署" icon="server" href="/channels/weixin">
|
||||
<Card title="多通道接入" icon="server" href="/channels/weixin">
|
||||
支持运行在本地计算机或服务器,可集成到微信、网页、飞书、钉钉、微信公众号、企业微信应用中使用。
|
||||
</Card>
|
||||
</CardGroup>
|
||||
@@ -45,9 +51,18 @@ CowAgent 支持灵活切换多种模型,能处理文本、语音、图片、
|
||||
|
||||
在终端执行以下命令,即可一键安装、配置、启动 CowAgent:
|
||||
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
<Tabs>
|
||||
<Tab title="Linux / macOS">
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Windows (PowerShell)">
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
运行后默认会启动 Web 控制台,通过访问 `http://localhost:9899` 可以在网页端进行对话、配置、应用通道接入等操作。
|
||||
|
||||
|
||||
@@ -13,6 +13,7 @@
|
||||
<a href="https://cowagent.ai/">🌐 ウェブサイト</a> ·
|
||||
<a href="https://docs.cowagent.ai/en/intro/index">📖 ドキュメント</a> ·
|
||||
<a href="https://docs.cowagent.ai/en/guide/quick-start">🚀 クイックスタート</a> ·
|
||||
<a href="https://skills.cowagent.ai/">🧩 Skill Hub</a> ·
|
||||
<a href="https://link-ai.tech/cowagent/create">☁️ オンラインで試す</a>
|
||||
</p>
|
||||
|
||||
@@ -20,13 +21,14 @@
|
||||
|
||||
> CowAgentは、すぐに使えるAIスーパーアシスタントであると同時に、高い拡張性を持つAgentフレームワークでもあります。新しいモデルインターフェース、チャネル、組み込みツール、Skillシステムを拡張することで、さまざまなカスタマイズニーズに柔軟に対応できます。
|
||||
|
||||
- ✅ **自律的タスク計画**: 複雑なタスクを理解し、自律的に実行計画を立て、目標達成までツールを呼び出しながら継続的に思考します。ツールを通じてファイル、ターミナル、ブラウザ、スケジューラなどのシステムリソースにアクセスできます。
|
||||
- ✅ **自律的タスク計画**: 複雑なタスクを理解し、自律的に実行計画を立て、目標達成までツールを呼び出しながら継続的に思考します。
|
||||
- ✅ **長期記憶**: 会話の記憶をローカルファイルやデータベースに自動的に永続化します。コアメモリとデイリーメモリを含み、キーワード検索やベクトル検索に対応しています。
|
||||
- ✅ **Skillシステム**: Skillの作成・実行エンジンを実装しており、複数の組み込みSkillを備え、自然言語での会話を通じたカスタムSkillの開発もサポートしています。
|
||||
- ✅ **Skillシステム**: Skillの作成・実行エンジンを実装。[Skill Hub](https://skills.cowagent.ai)、GitHubなどからSkillをインストールでき、会話を通じたカスタムSkill作成もサポートしています。
|
||||
- ✅ **ツールシステム**: ファイル読み書き、ターミナル実行、ブラウザ操作、スケジュールタスク、メッセージ送信などの組み込みツールを提供。Agentが自律的に呼び出して複雑なタスクを完了します。
|
||||
- ✅ **CLIシステム**: ターミナルコマンドとチャットコマンドを提供し、プロセス管理、Skillインストール、設定変更などの操作をサポートします。
|
||||
- ✅ **マルチモーダルメッセージ**: テキスト、画像、音声、ファイルなど、さまざまなメッセージタイプの解析・処理・生成・送信に対応しています。
|
||||
- ✅ **複数モデル対応**: OpenAI、Claude、Gemini、DeepSeek、MiniMax、GLM、Qwen、Kimi、Doubaoなど、主要なモデルプロバイダーに対応しています。
|
||||
- ✅ **マルチプラットフォームデプロイ**: ローカルPCやサーバー上で実行でき、WeChat、Web、Feishu、DingTalk、WeChat公式アカウント、WeComアプリケーションに統合可能です。
|
||||
- ✅ **ナレッジベース**: [LinkAI](https://link-ai.tech) プラットフォームを通じて、企業向けナレッジベース機能を統合できます。
|
||||
|
||||
## 免責事項
|
||||
|
||||
@@ -40,6 +42,8 @@
|
||||
|
||||
## 更新履歴
|
||||
|
||||
> **2026.04.01:** [v2.0.5](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.5) — Cow CLI、Skill Hubオープンソース化、ブラウザツール、WeCom Botスキャン作成など。
|
||||
|
||||
> **2026.02.27:** [v2.0.2](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.2) — Webコンソールの全面刷新(ストリーミングチャット、モデル/Skill/メモリ/チャネル/スケジューラ/ログ管理)、マルチチャネル同時実行、セッション永続化、Gemini 3.1 Pro / Claude 4.6 Sonnet / Qwen3.5 Plusなど新モデル追加。
|
||||
|
||||
> **2026.02.13:** [v2.0.1](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.1) — 組み込みWeb検索ツール、スマートコンテキストトリミング、ランタイム情報の動的更新、Windows互換性、スケジューラのメモリ喪失やFeishu接続問題などの修正。
|
||||
@@ -60,13 +64,19 @@
|
||||
|
||||
本プロジェクトは、インストール・設定・起動・管理をワンクリックで行えるスクリプトを提供しています:
|
||||
|
||||
**Linux / macOS:**
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
|
||||
**Windows (PowerShell):**
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
|
||||
実行後、デフォルトでWebサービスが起動します。`http://localhost:9899/chat` にアクセスしてチャットを開始できます。
|
||||
|
||||
スクリプトの使い方: [ワンクリックインストール](https://docs.cowagent.ai/en/guide/quick-start)
|
||||
スクリプトの使い方: [ワンクリックインストール](https://docs.cowagent.ai/ja/guide/quick-start)。インストール後は `cow start`、`cow stop` などの [CLI コマンド](https://docs.cowagent.ai/ja/commands/index)でサービスを管理できます。
|
||||
|
||||
### 手動インストール
|
||||
|
||||
@@ -84,7 +94,25 @@ pip3 install -r requirements.txt
|
||||
pip3 install -r requirements-optional.txt # 任意ですが推奨
|
||||
```
|
||||
|
||||
**3. 設定**
|
||||
**3. Cow CLI のインストール(推奨)**
|
||||
|
||||
```bash
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
インストール後、`cow` コマンドでサービス管理(起動、停止、更新など)やSkill管理ができます。[コマンドドキュメント](https://docs.cowagent.ai/ja/commands/index)を参照してください。
|
||||
|
||||
**4. ブラウザのインストール(任意)**
|
||||
|
||||
Agentにブラウザ操作(Webページへのアクセス、フォーム入力など)が必要な場合:
|
||||
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
`playwright` と Chromium を自動インストールします。[ブラウザツールドキュメント](https://docs.cowagent.ai/ja/tools/browser)を参照してください。
|
||||
|
||||
**5. 設定**
|
||||
|
||||
```bash
|
||||
cp config-template.json config.json
|
||||
@@ -92,13 +120,25 @@ cp config-template.json config.json
|
||||
|
||||
`config.json` にモデルのAPIキーとチャネルタイプを記入してください。詳細は[設定ドキュメント](https://docs.cowagent.ai/en/guide/manual-install)を参照してください。
|
||||
|
||||
**4. 実行**
|
||||
**6. 実行**
|
||||
|
||||
```bash
|
||||
python3 app.py
|
||||
cow start # 推奨、Cow CLI が必要
|
||||
python3 app.py # または直接実行
|
||||
```
|
||||
|
||||
サーバーでバックグラウンド実行する場合:
|
||||
サーバーデプロイでは、`cow` コマンドでサービスを管理できます:
|
||||
|
||||
```bash
|
||||
cow start # バックグラウンドで起動
|
||||
cow stop # サービス停止
|
||||
cow restart # サービス再起動
|
||||
cow status # 実行状態を確認
|
||||
cow logs # ログを表示
|
||||
cow update # 最新コードを取得して再起動
|
||||
```
|
||||
|
||||
または従来の方法で実行:
|
||||
|
||||
```bash
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
@@ -186,6 +226,7 @@ Coding Planは各プロバイダーが提供する月額サブスクリプショ
|
||||
|
||||
## 🔗 関連プロジェクト
|
||||
|
||||
- [Cow Skill Hub](https://github.com/zhayujie/cow-skill-hub): AIエージェント向けのオープンSkillマーケットプレイス。CowAgent、OpenClaw、Claude Codeなどで利用可能なSkillの閲覧・検索・インストール・公開が可能。
|
||||
- [bot-on-anything](https://github.com/zhayujie/bot-on-anything): 軽量で高い拡張性を持つLLMアプリケーションフレームワーク。Slack、Telegram、Discord、Gmailなどに対応。
|
||||
- [AgentMesh](https://github.com/MinimalFuture/AgentMesh): エージェントチームの協調による複雑な問題解決のためのオープンソースのマルチエージェントフレームワーク。
|
||||
|
||||
@@ -195,7 +236,7 @@ FAQ: <https://github.com/zhayujie/chatgpt-on-wechat/wiki/FAQs>
|
||||
|
||||
## 🛠️ コントリビューション
|
||||
|
||||
新しいチャネルの追加を歓迎します。[Feishuチャネル](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py)を参考にしてください。また、新しいSkillのコントリビューションも歓迎します。[Skill Creatorドキュメント](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md)を参照してください。
|
||||
新しいチャネルの追加を歓迎します。[Feishuチャネル](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/channel/feishu/feishu_channel.py)を参考にしてください。また、新しいSkillのコントリビューションも歓迎します。[Skill作成ドキュメント](https://docs.cowagent.ai/ja/skills/create)を参照するか、[Skill Hub](https://skills.cowagent.ai/submit)に提出してください。
|
||||
|
||||
## ✉ お問い合わせ
|
||||
|
||||
|
||||
101
docs/ja/commands/general.mdx
Normal file
101
docs/ja/commands/general.mdx
Normal file
@@ -0,0 +1,101 @@
|
||||
---
|
||||
title: 汎用コマンド
|
||||
description: ステータスの確認、設定管理、コンテキスト制御などのよく使うコマンド
|
||||
---
|
||||
|
||||
以下のコマンドはチャットで `/` プレフィックス、ターミナルで `cow` プレフィックスで使用できます(一部はチャット専用)。
|
||||
|
||||
<Tip>
|
||||
Web コンソールでは `/` を入力すると自動補完メニューが表示され、キーボードのナビゲーションと Tab 補完に対応しています。
|
||||
</Tip>
|
||||
|
||||
## help
|
||||
|
||||
使用可能なすべてのコマンドのヘルプ情報を表示します。
|
||||
|
||||
```text
|
||||
/help
|
||||
```
|
||||
|
||||
## status
|
||||
|
||||
現在のセッションとサービスの実行状態を表示します。プロセス情報、モデル設定、メッセージ数、読み込み済みスキル数を含みます。
|
||||
|
||||
```text
|
||||
/status
|
||||
```
|
||||
|
||||
## config
|
||||
|
||||
実行時設定の表示または変更を行います。変更は即座に反映され、再起動は不要です。
|
||||
|
||||
**すべての設定項目を表示:**
|
||||
|
||||
```text
|
||||
/config
|
||||
```
|
||||
|
||||
**単一の設定項目を表示:**
|
||||
|
||||
```text
|
||||
/config model
|
||||
```
|
||||
|
||||
**設定項目を変更:**
|
||||
|
||||
```text
|
||||
/config model deepseek-chat
|
||||
```
|
||||
|
||||
**変更可能な設定項目:**
|
||||
|
||||
| 項目 | 説明 | 例 |
|
||||
| --- | --- | --- |
|
||||
| `model` | AI モデル名 | `deepseek-chat` |
|
||||
| `agent_max_context_tokens` | 最大コンテキストトークン数 | `40000` |
|
||||
| `agent_max_context_turns` | 最大コンテキスト記憶ターン数 | `30` |
|
||||
| `agent_max_steps` | タスクごとの最大判断ステップ数 | `15` |
|
||||
|
||||
<Note>
|
||||
`model` を変更すると、システムが対応するモデル API を自動的にマッチングします。設定は `config.json` に永続的に保存されます。
|
||||
</Note>
|
||||
|
||||
## context
|
||||
|
||||
現在のセッションのコンテキスト統計情報を表示します。メッセージ数やコンテンツの長さを含みます。
|
||||
|
||||
```text
|
||||
/context
|
||||
```
|
||||
|
||||
**現在のセッションのコンテキストをクリア:**
|
||||
|
||||
```text
|
||||
/context clear
|
||||
```
|
||||
|
||||
<Tip>
|
||||
コンテキストをクリアすると、Agent は以前の会話内容を「忘れます」。話題の切り替えやコンテキストスペースの解放に便利です。
|
||||
</Tip>
|
||||
|
||||
## logs
|
||||
|
||||
最近のサービスログを表示します。デフォルトでは最近の 20 行を表示し、最大 50 行です。
|
||||
|
||||
```text
|
||||
/logs
|
||||
```
|
||||
|
||||
**行数を指定:**
|
||||
|
||||
```text
|
||||
/logs 50
|
||||
```
|
||||
|
||||
## version
|
||||
|
||||
現在の CowAgent のバージョンを表示します。
|
||||
|
||||
```text
|
||||
/version
|
||||
```
|
||||
84
docs/ja/commands/index.mdx
Normal file
84
docs/ja/commands/index.mdx
Normal file
@@ -0,0 +1,84 @@
|
||||
---
|
||||
title: コマンド概要
|
||||
description: CowAgent コマンドシステム — ターミナル CLI とチャットコマンド
|
||||
---
|
||||
|
||||
CowAgent は2つのコマンド操作方法を提供しています:
|
||||
|
||||
- **ターミナル CLI** — システムターミナルで `cow <コマンド>` を実行し、サービス管理やスキル管理を行います
|
||||
- **チャットコマンド** — 会話で `/<コマンド>` または `cow <コマンド>` を入力し、ステータス確認、スキル管理、設定変更を行います
|
||||
|
||||
## Cow CLI
|
||||
|
||||
ワンクリックインストールスクリプトでデプロイすると、`cow` コマンドが自動的に利用可能になります。手動インストールの場合は以下を実行してください:
|
||||
|
||||
```bash
|
||||
pip install -e .
|
||||
```
|
||||
|
||||
インストール後、任意の場所で `cow` コマンドを使用できます:
|
||||
|
||||
```bash
|
||||
cow help
|
||||
```
|
||||
|
||||
出力例:
|
||||
|
||||
```
|
||||
🐮 CowAgent CLI
|
||||
|
||||
Usage: cow <command>
|
||||
|
||||
Service:
|
||||
start Start the CowAgent service
|
||||
stop Stop the CowAgent service
|
||||
restart Restart the CowAgent service
|
||||
update Update code and restart service
|
||||
status Show service status
|
||||
logs View service logs
|
||||
|
||||
Skills:
|
||||
skill Manage skills (list / search / install / uninstall ...)
|
||||
|
||||
Others:
|
||||
help Show this help message
|
||||
version Show version
|
||||
```
|
||||
|
||||
## チャットコマンド
|
||||
|
||||
Web コンソールや接続されたチャネルの会話で `/` を入力すると、コマンドの候補が表示されます。使用可能なコマンド:
|
||||
|
||||
| コマンド | 説明 |
|
||||
| --- | --- |
|
||||
| `/help` | コマンドヘルプを表示 |
|
||||
| `/status` | サービスの状態と設定を表示 |
|
||||
| `/config` | 実行時設定の表示・変更 |
|
||||
| `/skill` | スキル管理(インストール、アンインストール、有効化、無効化など) |
|
||||
| `/context` | 現在のセッションのコンテキスト情報を表示 |
|
||||
| `/context clear` | 現在のセッションのコンテキストをクリア |
|
||||
| `/logs` | 最近のログを表示 |
|
||||
| `/version` | バージョン番号を表示 |
|
||||
|
||||
<Tip>
|
||||
`/start`、`/stop`、`/restart` などのサービス管理コマンドは、プロセス操作を伴うため、ターミナルでの使用を案内します。
|
||||
</Tip>
|
||||
|
||||
## コマンド対応表
|
||||
|
||||
| コマンド | ターミナル (`cow`) | チャット (`/`) |
|
||||
| --- | :---: | :---: |
|
||||
| help | ✓ | ✓ |
|
||||
| version | ✓ | ✓ |
|
||||
| status | ✓ | ✓ |
|
||||
| logs | ✓ | ✓ |
|
||||
| config | ✗ | ✓ |
|
||||
| context | — | ✓ |
|
||||
| skill(サブコマンド) | ✓ | ✓ |
|
||||
| start / stop / restart | ✓ | ✗ |
|
||||
| update | ✓ | ✗ |
|
||||
| install-browser | ✓ | ✗ |
|
||||
|
||||
<Note>
|
||||
`context` はターミナルではチャットでの使用を案内するのみです。`config` はチャットでのみ利用可能です。
|
||||
</Note>
|
||||
123
docs/ja/commands/process.mdx
Normal file
123
docs/ja/commands/process.mdx
Normal file
@@ -0,0 +1,123 @@
|
||||
---
|
||||
title: プロセス管理
|
||||
description: cow コマンドで CowAgent プロセスのライフサイクルを管理
|
||||
---
|
||||
|
||||
プロセス管理コマンドは CowAgent バックグラウンドプロセスのライフサイクルを制御します。これらのコマンドはターミナルでのみ使用可能です。
|
||||
|
||||
## start
|
||||
|
||||
CowAgent サービスを起動します。デフォルトではバックグラウンドデーモンとして実行され、自動的にログを表示します。
|
||||
|
||||
```bash
|
||||
cow start
|
||||
```
|
||||
|
||||
**オプション:**
|
||||
|
||||
| オプション | 説明 |
|
||||
| --- | --- |
|
||||
| `-f`, `--foreground` | フォアグラウンドで実行(デーモンとして起動しない) |
|
||||
| `--no-logs` | 起動後にログを自動表示しない |
|
||||
|
||||
## stop
|
||||
|
||||
実行中の CowAgent サービスを停止します。
|
||||
|
||||
```bash
|
||||
cow stop
|
||||
```
|
||||
|
||||
## restart
|
||||
|
||||
CowAgent サービスを再起動します(停止してから起動)。
|
||||
|
||||
```bash
|
||||
cow restart
|
||||
```
|
||||
|
||||
**オプション:**
|
||||
|
||||
| オプション | 説明 |
|
||||
| --- | --- |
|
||||
| `--no-logs` | 再起動後にログを自動表示しない |
|
||||
|
||||
## update
|
||||
|
||||
コードを更新してサービスを再起動します。自動的に以下を実行します:
|
||||
|
||||
1. 最新コードをプル(`git pull`)
|
||||
2. 現在のサービスを停止
|
||||
3. Python 依存パッケージを更新
|
||||
4. CLI を再インストール
|
||||
5. サービスを起動
|
||||
|
||||
```bash
|
||||
cow update
|
||||
```
|
||||
|
||||
<Warning>
|
||||
`git pull` が失敗した場合(ローカルの未コミットの変更がある場合など)、更新は中止され、サービスには影響しません。
|
||||
</Warning>
|
||||
|
||||
## status
|
||||
|
||||
CowAgent サービスの実行状態を確認します。プロセス情報、バージョン、現在のモデルとチャネルの設定を含みます。
|
||||
|
||||
```bash
|
||||
cow status
|
||||
```
|
||||
|
||||
## logs
|
||||
|
||||
サービスログを表示します。
|
||||
|
||||
```bash
|
||||
cow logs
|
||||
```
|
||||
|
||||
**オプション:**
|
||||
|
||||
| オプション | 説明 | デフォルト値 |
|
||||
| --- | --- | --- |
|
||||
| `-f`, `--follow` | ログ出力を継続的に追跡 | いいえ |
|
||||
| `-n`, `--lines` | 最近の N 行を表示 | 50 |
|
||||
|
||||
例:
|
||||
|
||||
```bash
|
||||
# 最近の100行を表示
|
||||
cow logs -n 100
|
||||
|
||||
# ログを継続的に追跡
|
||||
cow logs -f
|
||||
```
|
||||
|
||||
## install-browser
|
||||
|
||||
[ブラウザツール](/ja/tools/browser)のために Playwright と Chromium ブラウザをインストールします。
|
||||
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
<Tip>
|
||||
ブラウザツール(Web ブラウジング、スクリーンショットなど)を使用する場合にのみ必要です。
|
||||
</Tip>
|
||||
|
||||
## run.sh との互換性
|
||||
|
||||
Cow CLI がインストールされていない場合は、`run.sh` でサービスを管理できます:
|
||||
|
||||
| cow コマンド | run.sh 相当 |
|
||||
| --- | --- |
|
||||
| `cow start` | `./run.sh start` |
|
||||
| `cow stop` | `./run.sh stop` |
|
||||
| `cow restart` | `./run.sh restart` |
|
||||
| `cow update` | `./run.sh update` |
|
||||
| `cow status` | `./run.sh status` |
|
||||
| `cow logs` | `./run.sh logs` |
|
||||
|
||||
<Note>
|
||||
`cow` コマンドの使用を推奨します。よりシンプルな構文と豊富な機能を提供します。ワンクリックインストールスクリプトで自動的にインストールされます。
|
||||
</Note>
|
||||
192
docs/ja/commands/skill.mdx
Normal file
192
docs/ja/commands/skill.mdx
Normal file
@@ -0,0 +1,192 @@
|
||||
---
|
||||
title: スキル管理
|
||||
description: コマンドでスキルのインストール、アンインストール、有効化、無効化、管理を行う
|
||||
---
|
||||
|
||||
スキル管理コマンドは CowAgent のスキルのインストール、検索、管理に使用します。チャットでは `/skill <サブコマンド>`、ターミナルでは `cow skill <サブコマンド>` を使用します。
|
||||
|
||||
## list
|
||||
|
||||
インストール済みスキルとその状態を一覧表示します。
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill list
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill list
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**スキル広場を閲覧**(利用可能なすべてのスキルを表示):
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill list --remote
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill list --remote
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**オプション:**
|
||||
|
||||
| オプション | 説明 | デフォルト値 |
|
||||
| --- | --- | --- |
|
||||
| `--remote`, `-r` | Skill Hub のリモートスキルリストを閲覧 | いいえ |
|
||||
| `--page` | リモートリストのページ番号 | 1 |
|
||||
|
||||
## search
|
||||
|
||||
スキル広場でスキルを検索します。
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill search pptx
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill search pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## install
|
||||
|
||||
統一された `install` コマンドで、Cow スキル広場、GitHub、ClawHub、任意の URL(zip アーカイブ、SKILL.md リンク)からスキルをワンクリックでインストールできます。手動ダウンロードや設定は不要です。
|
||||
|
||||
**スキル広場からインストール(推奨):**
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill install pptx
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill install pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**GitHub からインストール:**
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
# リポジトリ内のすべてのスキルをインストール(SKILL.md を含むサブディレクトリを自動検出)
|
||||
/skill install larksuite/cli
|
||||
|
||||
# サブディレクトリを指定して単一スキルをインストール
|
||||
/skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
|
||||
# # でサブディレクトリを指定
|
||||
/skill install larksuite/cli#skills/lark-minutes
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
# リポジトリ内のすべてのスキルをインストール(SKILL.md を含むサブディレクトリを自動検出)
|
||||
cow skill install larksuite/cli
|
||||
|
||||
# サブディレクトリを指定して単一スキルをインストール
|
||||
cow skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
|
||||
# # でサブディレクトリを指定
|
||||
cow skill install larksuite/cli#skills/lark-minutes
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
完全な GitHub URL と `owner/repo` 省略形に対応しています。モノリポ(1つのリポジトリに複数のスキル)の場合、サブディレクトリを省略するとすべてのスキルを自動検出して一括インストールします。サブディレクトリを指定した場合は、そのスキルのみをインストールします。
|
||||
|
||||
**ClawHub からインストール:**
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill install clawhub:baidu-search
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill install clawhub:baidu-search
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
**URL からインストール:**
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
# zip アーカイブからインストール(単一またはバッチ)
|
||||
/skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
|
||||
# SKILL.md リンクからインストール
|
||||
/skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
# zip アーカイブからインストール(単一またはバッチ)
|
||||
cow skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
|
||||
# SKILL.md リンクからインストール
|
||||
cow skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
zip / tar.gz アーカイブ URL からのインストールに対応しており、自動的に解凍して `SKILL.md` を含むディレクトリを検出し、単一またはバッチインストールをサポートします。`SKILL.md` ファイルの URL から直接インストールすることもでき、スキル名と説明を自動的に解析します。
|
||||
|
||||
## uninstall
|
||||
|
||||
インストール済みスキルをアンインストールします。
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill uninstall pptx
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill uninstall pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
<Warning>
|
||||
アンインストールするとスキルディレクトリ内のすべてのファイルが削除されます。この操作は元に戻せません。
|
||||
</Warning>
|
||||
|
||||
## enable / disable
|
||||
|
||||
スキルの有効化・無効化を行います。無効化されたスキルは Agent から呼び出されません。
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill enable pptx
|
||||
/skill disable pptx
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill enable pptx
|
||||
cow skill disable pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## info
|
||||
|
||||
インストール済みスキルの詳細情報を表示します。`SKILL.md` のプレビューを含みます。
|
||||
|
||||
<CodeGroup>
|
||||
```text チャット
|
||||
/skill info pptx
|
||||
```
|
||||
|
||||
```bash ターミナル
|
||||
cow skill info pptx
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
## スキルのソース
|
||||
|
||||
インストールされたスキルはソース情報を記録しており、`/skill list` で確認できます:
|
||||
|
||||
| ソース | 説明 |
|
||||
| --- | --- |
|
||||
| `builtin` | プロジェクト内蔵スキル |
|
||||
| `cowhub` | CowAgent Skill Hub からインストール |
|
||||
| `github` | GitHub URL から直接インストール |
|
||||
| `clawhub` | ClawHub からインストール |
|
||||
| `url` | SKILL.md URL からインストール |
|
||||
| `local` | ローカルで作成されたスキル |
|
||||
@@ -30,7 +30,25 @@ pip3 install -r requirements.txt
|
||||
pip3 install -r requirements-optional.txt
|
||||
```
|
||||
|
||||
### 3. 設定
|
||||
### 3. Cow CLI をインストール
|
||||
|
||||
サービスとスキルを管理するためのコマンドラインツールをインストールします:
|
||||
|
||||
```bash
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
インストール後、`cow` コマンドが使用可能になります:
|
||||
|
||||
```bash
|
||||
cow help
|
||||
```
|
||||
|
||||
<Note>
|
||||
このステップは推奨です。インストール後、`cow start`、`cow stop`、`cow update` でサービスを管理でき、`cow skill` でスキルを管理できます。CLI をインストールしない場合は、`./run.sh` または `python3 app.py` で実行できます。
|
||||
</Note>
|
||||
|
||||
### 4. 設定
|
||||
|
||||
設定テンプレートをコピーして編集します:
|
||||
|
||||
@@ -40,22 +58,32 @@ cp config-template.json config.json
|
||||
|
||||
`config.json` にモデルの API キー、チャネルタイプ、その他の設定を入力します。詳細は[モデルのドキュメント](/ja/models/index)を参照してください。
|
||||
|
||||
### 4. 実行
|
||||
### 5. 実行
|
||||
|
||||
**ローカルで実行:**
|
||||
**Cow CLI を使用して実行(推奨):**
|
||||
|
||||
```bash
|
||||
cow start
|
||||
```
|
||||
|
||||
**またはローカルでフォアグラウンド実行:**
|
||||
|
||||
```bash
|
||||
python3 app.py
|
||||
```
|
||||
|
||||
デフォルトではWebサービスが起動します。`http://localhost:9899/chat` にアクセスしてチャットできます。
|
||||
デフォルトでは Web コンソールが起動します。`http://localhost:9899` にアクセスしてチャットできます。
|
||||
|
||||
**サーバーでバックグラウンド実行:**
|
||||
**サーバーでバックグラウンド実行(CLI 未使用時):**
|
||||
|
||||
```bash
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
```
|
||||
|
||||
<Tip>
|
||||
サーバーにデプロイする場合は、ファイアウォールまたはセキュリティグループでポート `9899` を開放して Web コンソールにアクセスできるようにしてください。セキュリティのため、特定の IP のみにアクセスを制限することを推奨します。
|
||||
</Tip>
|
||||
|
||||
## Docker によるデプロイ
|
||||
|
||||
Docker デプロイでは、ソースコードのクローンや依存パッケージのインストールは不要です。Agent モードを使用する場合は、より広範なシステムアクセスが可能なソースコードによるデプロイを推奨します。
|
||||
@@ -84,6 +112,10 @@ sudo docker compose up -d
|
||||
sudo docker logs -f chatgpt-on-wechat
|
||||
```
|
||||
|
||||
<Tip>
|
||||
サーバーにデプロイする場合は、ファイアウォールまたはセキュリティグループでポート `9899` を開放して Web コンソールにアクセスできるようにしてください。セキュリティのため、特定の IP のみにアクセスを制限することを推奨します。
|
||||
</Tip>
|
||||
|
||||
## 主要な設定項目
|
||||
|
||||
```json
|
||||
|
||||
@@ -9,31 +9,46 @@ Linux、macOS、Windowsに対応しています。Python 3.7〜3.12が必要で
|
||||
|
||||
## インストールコマンド
|
||||
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
<Tabs>
|
||||
<Tab title="Linux / macOS">
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Windows (PowerShell)">
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
スクリプトは以下の手順を自動的に実行します:
|
||||
|
||||
1. Python環境の確認(Python 3.7以上が必要)
|
||||
2. 必要なツールのインストール(git、curlなど)
|
||||
3. プロジェクトを `~/chatgpt-on-wechat` にクローン
|
||||
4. Pythonの依存パッケージをインストール
|
||||
4. Pythonの依存パッケージと Cow CLI をインストール
|
||||
5. AIモデルとチャネルの対話式設定
|
||||
6. サービスの起動
|
||||
|
||||
デフォルトでは、インストール後にWebサービスが起動します。`http://localhost:9899/chat` にアクセスしてチャットを開始できます。
|
||||
デフォルトでは、インストール後に Web コンソールが起動します。`http://localhost:9899` にアクセスしてチャットを開始できます。
|
||||
|
||||
## 管理コマンド
|
||||
|
||||
インストール後、以下のコマンドでサービスを管理できます:
|
||||
インストール後、`cow` コマンドでサービスを管理できます:
|
||||
|
||||
| コマンド | 説明 |
|
||||
| --- | --- |
|
||||
| `./run.sh start` | サービスを起動 |
|
||||
| `./run.sh stop` | サービスを停止 |
|
||||
| `./run.sh restart` | サービスを再起動 |
|
||||
| `./run.sh status` | 実行状態を確認 |
|
||||
| `./run.sh logs` | リアルタイムログを表示 |
|
||||
| `./run.sh config` | 再設定 |
|
||||
| `./run.sh update` | プロジェクトコードを更新 |
|
||||
| `cow start` | サービスを起動 |
|
||||
| `cow stop` | サービスを停止 |
|
||||
| `cow restart` | サービスを再起動 |
|
||||
| `cow status` | 実行状態を確認 |
|
||||
| `cow logs` | リアルタイムログを表示 |
|
||||
| `cow update` | コードを更新して再起動 |
|
||||
| `cow install-browser` | ブラウザツールの依存をインストール |
|
||||
|
||||
詳細は[コマンドドキュメント](/ja/commands/index)を参照してください。
|
||||
|
||||
<Note>
|
||||
`cow` コマンドが利用できない場合は、`./run.sh <コマンド>`(Linux/macOS)または `.\scripts\run.ps1 <コマンド>`(Windows)で代替できます。機能は同等です。
|
||||
</Note>
|
||||
|
||||
@@ -3,20 +3,25 @@ title: アップデート
|
||||
description: CowAgent のアップグレード方法
|
||||
---
|
||||
|
||||
## スクリプトによるアップグレード(推奨)
|
||||
## コマンドによるアップグレード(推奨)
|
||||
|
||||
`run.sh` でサービスを管理している場合、以下のコマンドでワンクリックアップグレードできます:
|
||||
`cow update` でコードの更新とサービスの再起動をワンクリックで実行できます:
|
||||
|
||||
```bash
|
||||
./run.sh update
|
||||
cow update
|
||||
```
|
||||
|
||||
このコマンドは以下のフローを自動的に実行します:
|
||||
|
||||
1. 現在実行中のサービスを停止
|
||||
2. 最新コードをプル
|
||||
3. 依存関係を再チェック
|
||||
4. サービスを起動
|
||||
1. 最新コードをプル(`git pull`)
|
||||
2. 現在のサービスを停止
|
||||
3. Python 依存パッケージを更新
|
||||
4. CLI を再インストール
|
||||
5. サービスを起動
|
||||
|
||||
<Note>
|
||||
Cow CLI がインストールされていない場合は、`./run.sh update` でも同様の操作が可能です。
|
||||
</Note>
|
||||
|
||||
## 手動アップグレード
|
||||
|
||||
@@ -25,15 +30,19 @@ description: CowAgent のアップグレード方法
|
||||
```bash
|
||||
git pull
|
||||
pip3 install -r requirements.txt
|
||||
pip3 install -e .
|
||||
```
|
||||
|
||||
更新完了後、サービスを再起動します:
|
||||
|
||||
```bash
|
||||
# run.sh で管理している場合
|
||||
# Cow CLI を使用
|
||||
cow restart
|
||||
|
||||
# または run.sh を使用
|
||||
./run.sh restart
|
||||
|
||||
# nohup で直接実行している場合
|
||||
# または nohup で直接実行
|
||||
kill $(ps -ef | grep app.py | grep -v grep | awk '{print $2}')
|
||||
nohup python3 app.py & tail -f nohup.out
|
||||
```
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
---
|
||||
title: 機能詳細
|
||||
description: CowAgent の長期記憶、タスク計画、Skill システムの詳細
|
||||
description: CowAgent の長期記憶、タスク計画、Skill システム、CLI コマンド、ブラウザツールの詳細
|
||||
---
|
||||
|
||||
## 1. 長期記憶
|
||||
@@ -19,7 +19,7 @@ description: CowAgent の長期記憶、タスク計画、Skill システムの
|
||||
|
||||
ツールは Agent がオペレーティングシステムのリソースにアクセスするための中核です。Agent はタスク要件に基づいてインテリジェントにツールを選択・呼び出し、ファイルの読み書き、コマンド実行、スケジュールタスクなどを実行します。組み込みツールはプロジェクトの `agent/tools/` ディレクトリに実装されています。
|
||||
|
||||
**主なツール:** ファイルの読み書き・編集、Bash ターミナル、ファイル送信、スケジューラ、記憶検索、Web 検索、環境設定など。
|
||||
**主なツール:** ファイルの読み書き・編集、Bash ターミナル、ブラウザ操作、ファイル送信、スケジューラ、記憶検索、Web 検索、環境設定など。
|
||||
|
||||
### 2.1 ターミナルとファイルアクセス
|
||||
|
||||
@@ -45,7 +45,15 @@ OS のターミナルとファイルシステムへのアクセスは、最も
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 2.4 環境変数管理
|
||||
### 2.4 ブラウザ操作
|
||||
|
||||
組み込みの `browser` ツールにより、Agent は Chromium ブラウザを制御して Web ページへのアクセス、フォームの入力、要素のクリック、スクリーンショットの撮影が可能です。動的 JS レンダリングページにも対応しています。`cow install-browser` でワンコマンドインストール、サーバー(ヘッドレス)とデスクトップ環境に自動対応します:
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 2.5 環境変数管理
|
||||
|
||||
Skill が必要とするシークレットキーは環境変数ファイルに保存され、`env_config` ツールによって管理されます。会話を通じてシークレットを更新でき、セキュリティ保護とマスキング機能が組み込まれています:
|
||||
|
||||
@@ -57,9 +65,12 @@ Skill が必要とするシークレットキーは環境変数ファイルに
|
||||
|
||||
Skill システムは Agent に無限の拡張性を提供します。各 Skill は説明ファイル、実行スクリプト(任意)、リソース(任意)で構成され、特定のタイプのタスクを完了する方法を記述します。Skill により Agent は複雑なワークフローの指示に従い、ツールを呼び出し、サードパーティシステムと連携できます。
|
||||
|
||||
- **[Skill Hub](https://skills.cowagent.ai/):** オープンな Skill マーケットプレイス。公式推奨、コミュニティ、サードパーティの Skill を収録。ワンコマンドでインストール可能。
|
||||
- **組み込み Skill:** プロジェクトの `skills/` ディレクトリにあり、Skill クリエイター、画像認識、LinkAI Agent、Web フェッチなどが含まれます。組み込み Skill は依存条件(API キー、システムコマンドなど)に基づいて自動的に有効化されます。
|
||||
- **カスタム Skill:** ユーザーが会話を通じて作成し、ワークスペース(`~/cow/skills/`)に保存されます。あらゆる複雑なビジネスプロセスやサードパーティ連携を実装できます。
|
||||
|
||||
Skill のインストール:`/skill install <名前>` または `cow skill install <名前>`。Skill Hub、GitHub、ClawHub、URL などからインストール可能。
|
||||
|
||||
### 3.1 Skill の作成
|
||||
|
||||
`skill-creator` Skill により、会話を通じて Skill を素早く作成できます。ワークフローを Skill としてコード化するよう Agent に依頼したり、API ドキュメントやサンプルを送信して Agent に直接連携を完成させることができます:
|
||||
@@ -77,29 +88,33 @@ Skill システムは Agent に無限の拡張性を提供します。各 Skill
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
### 3.3 サードパーティナレッジベースとプラグイン
|
||||
### 3.3 Skill Hub
|
||||
|
||||
`linkai-agent` Skill により、[LinkAI](https://link-ai.tech/) 上のすべての Agent を Skill として利用でき、マルチ Agent による意思決定が可能になります。
|
||||
[skills.cowagent.ai](https://skills.cowagent.ai/) で利用可能なすべての Skill を閲覧するか、会話内でコマンドを実行できます:
|
||||
|
||||
設定方法:`env_config` で `LINKAI_API_KEY` を設定し、`skills/linkai-agent/config.json` に Agent の説明を追加します:
|
||||
|
||||
```json
|
||||
{
|
||||
"apps": [
|
||||
{
|
||||
"app_code": "G7z6vKwp",
|
||||
"app_name": "LinkAI Customer Support",
|
||||
"app_description": "Select only when the user needs help with LinkAI platform questions"
|
||||
},
|
||||
{
|
||||
"app_code": "SFY5x7JR",
|
||||
"app_name": "Content Creator",
|
||||
"app_description": "Use only when the user needs to create images or videos"
|
||||
}
|
||||
]
|
||||
}
|
||||
```text
|
||||
/skill list --remote # Skill Hub を閲覧
|
||||
/skill search <キーワード> # Skill を検索
|
||||
/skill install <名前> # ワンコマンドでインストール
|
||||
```
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
|
||||
</Frame>
|
||||
GitHub、ClawHub、LinkAI などサードパーティプラットフォームの Skill もインストール可能です。詳細は [Skill のインストール](/ja/skills/install) を参照してください。
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="750" />
|
||||
|
||||
## 4. CLI コマンドシステム
|
||||
|
||||
CowAgent はサービス管理、Skill インストール、設定変更などをカバーする2つのコマンドインターフェースを提供します:
|
||||
|
||||
- **ターミナル CLI:** システムターミナルで `cow <コマンド>` を実行。`start`、`stop`、`restart`、`update`、`status`、`logs`、`skill` などをサポート。
|
||||
- **チャットコマンド:** 会話内で `/<コマンド>` を入力。Web コンソールでは `/` を入力するとコマンドメニューが表示されます。
|
||||
|
||||
```bash
|
||||
cow start # サービスを開始
|
||||
cow stop # サービスを停止
|
||||
cow update # 更新して再起動
|
||||
cow skill install pptx # Skill をインストール
|
||||
cow install-browser # ブラウザツールをインストール
|
||||
```
|
||||
|
||||
詳細は [コマンド一覧](https://docs.cowagent.ai/ja/commands) を参照してください。
|
||||
|
||||
@@ -28,6 +28,12 @@ CowAgent は自ら思考しタスクを計画し、コンピュータや外部
|
||||
<Card title="マルチモーダルメッセージ" icon="image" href="/ja/channels/web">
|
||||
テキスト、画像、音声、ファイルなどのメッセージタイプの解析、処理、生成、送信をサポートします。
|
||||
</Card>
|
||||
<Card title="ツールシステム" icon="wrench" href="/ja/tools/index">
|
||||
ファイル読み書き、ターミナル実行、ブラウザ操作、スケジュールタスク、メッセージ送信などの組み込みツールを提供。Agent が自律的にツールを呼び出して複雑なタスクを完了します。
|
||||
</Card>
|
||||
<Card title="コマンドシステム" icon="terminal" href="/ja/commands/index">
|
||||
ターミナル CLI とチャット内コマンドを提供し、プロセス管理、Skill インストール、設定変更、コンテキスト確認などの一般的な操作をサポートします。
|
||||
</Card>
|
||||
<Card title="複数モデル対応" icon="microchip" href="/ja/models/index">
|
||||
OpenAI、Claude、Gemini、DeepSeek、MiniMax、GLM、Qwen、Kimi、Doubao など、主要なモデルプロバイダーをサポートしています。
|
||||
</Card>
|
||||
@@ -40,9 +46,18 @@ CowAgent は自ら思考しタスクを計画し、コンピュータや外部
|
||||
|
||||
ターミナルで以下のコマンドを実行すると、ワンクリックでインストール、設定、起動ができます:
|
||||
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
<Tabs>
|
||||
<Tab title="Linux / macOS">
|
||||
```bash
|
||||
bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
|
||||
```
|
||||
</Tab>
|
||||
<Tab title="Windows (PowerShell)">
|
||||
```powershell
|
||||
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
デフォルトでは実行後に Web サービスが起動します。`http://localhost:9899/chat` にアクセスして Web インターフェースでチャットできます。
|
||||
|
||||
|
||||
80
docs/ja/memory/context.mdx
Normal file
80
docs/ja/memory/context.mdx
Normal file
@@ -0,0 +1,80 @@
|
||||
---
|
||||
title: 短期記憶
|
||||
description: 会話コンテキスト — メッセージ管理、圧縮戦略、コンテキスト操作
|
||||
---
|
||||
|
||||
会話コンテキストは Agent の短期記憶であり、現在のセッション内のすべてのメッセージ(ユーザー入力、Agent の返信、ツール呼び出しと結果)を含みます。適切なコンテキスト管理は、Agent の推論品質とコスト制御にとって重要です。
|
||||
|
||||
## コンテキストの構造
|
||||
|
||||
各会話ターンは以下で構成されます:
|
||||
|
||||
```
|
||||
ユーザーメッセージ → Agent の思考 → ツール呼び出し → ツール結果 → ... → Agent の最終返信
|
||||
```
|
||||
|
||||
1 つのターンには複数のツール呼び出しが含まれる場合があります(`agent_max_steps` で制御)。すべてのツール呼び出しと結果は、圧縮またはトリミングされるまでコンテキストに保持されます。
|
||||
|
||||
## 主要な設定
|
||||
|
||||
| パラメータ | 説明 | デフォルト値 |
|
||||
| --- | --- | --- |
|
||||
| `agent_max_context_tokens` | コンテキストの最大トークン予算 | `50000` |
|
||||
| `agent_max_context_turns` | コンテキストの最大会話ターン数 | `20` |
|
||||
| `agent_max_steps` | ターンあたりの最大判断ステップ数(ツール呼び出し回数) | `15` |
|
||||
|
||||
`config.json` またはチャットの `/config` コマンドで設定できます。
|
||||
|
||||
## 圧縮戦略
|
||||
|
||||
コンテキストが制限を超えた場合、システムは自動的に圧縮を実行してスペースを解放します。このプロセスには複数の段階があります:
|
||||
|
||||
### 1. ツール結果の切り詰め
|
||||
|
||||
各判断ループの開始前に、過去のターンのツール呼び出し結果を確認します。**20,000 文字** を超えるツール結果は切り詰められ、先頭と末尾のみが保持されます。現在のターンの結果は影響を受けません。
|
||||
|
||||
### 2. ターンのトリミング
|
||||
|
||||
会話ターン数が `agent_max_context_turns` を超えた場合:
|
||||
|
||||
- **最も古い半分** の完全なターンがトリミングされます(ツール呼び出しチェーンの完全性を保証)
|
||||
- トリミングされたメッセージは LLM によって要約され、**日次記憶ファイルに書き込まれます**
|
||||
- 残りのターンはそのまま保持されます
|
||||
|
||||
### 3. トークン予算のトリミング
|
||||
|
||||
ターンのトリミング後、トークン数がまだ予算を超えている場合:
|
||||
|
||||
- **5 ターン未満の場合**:すべてのターンで**テキスト圧縮**を実行 — 各ターンは最初のユーザーテキストと最後の Agent 返信のみを保持し、中間のツール呼び出しチェーンを削除
|
||||
- **5 ターン以上の場合**:**前半のターン**を再度トリミングし、破棄されたコンテンツも記憶に書き込まれます
|
||||
|
||||
### 4. オーバーフロー緊急処理
|
||||
|
||||
モデル API がコンテキストオーバーフローエラーを返した場合:
|
||||
|
||||
1. 現在のすべてのメッセージを要約して記憶に書き込み
|
||||
2. 積極的なトリミングを適用(ツール結果は 10K 文字に制限、ユーザーテキストは 10K、最大 5 ターン)
|
||||
3. それでもオーバーフローする場合は、会話コンテキスト全体をクリア
|
||||
|
||||
## セッションの永続化
|
||||
|
||||
会話メッセージはローカルデータベースに永続化され、サービス再起動後に自動的に復元されます。復元戦略:
|
||||
|
||||
- 最近の **`max(3, max_context_turns / 6)`** ターンを復元
|
||||
- 各ターンの**ユーザーテキストと Agent の最終返信のみ**を保持し、中間のツール呼び出しチェーンは復元しません
|
||||
- **30 日** を超える過去のセッションは自動的にクリーンアップされます
|
||||
|
||||
## 操作コマンド
|
||||
|
||||
チャットで以下のコマンドを使用してコンテキストを管理できます:
|
||||
|
||||
| コマンド | 説明 |
|
||||
| --- | --- |
|
||||
| `/context` | 現在のコンテキスト統計を表示(メッセージ数、ロール分布、合計文字数) |
|
||||
| `/context clear` | 現在のセッションコンテキストをクリア |
|
||||
| `/config agent_max_context_tokens 80000` | コンテキストトークン予算を調整 |
|
||||
| `/config agent_max_context_turns 30` | コンテキストターン上限を調整 |
|
||||
|
||||
<Tip>
|
||||
コンテキストをクリアすると、Agent は以前の会話内容を「忘れます」。すでに長期記憶に書き込まれたコンテンツは、記憶検索を通じて引き続き取得できます。
|
||||
</Tip>
|
||||
@@ -1,25 +1,25 @@
|
||||
---
|
||||
title: 記憶
|
||||
description: CowAgent 長期記憶システム
|
||||
title: 長期記憶
|
||||
description: CowAgent の長期記憶システム — ファイル永続化、自動書き込み、ハイブリッド検索
|
||||
---
|
||||
|
||||
記憶システムにより、Agent は重要な情報を長期にわたって記憶し、継続的に経験を蓄積し、ユーザーの好みを理解し、真に自律的な思考と継続的な成長を実現できます。
|
||||
長期記憶はワークスペースのファイルに保存され、セッション間で永続化されます。Agent は会話中に検索ツールを通じて過去の記憶をオンデマンドで読み込み、コンテキストのトリミング時に会話の要約を自動的に長期記憶に書き込みます。
|
||||
|
||||
## 記憶の種類
|
||||
|
||||
### コア記憶 (MEMORY.md)
|
||||
### コア記憶(MEMORY.md)
|
||||
|
||||
`~/cow/MEMORY.md` に保存され、長期的なユーザーの好み、重要な決定、主要な事実など、時間が経っても薄れない情報を含みます。毎回の会話ターンでバックグラウンド知識としてシステムプロンプトに自動的に注入されます。
|
||||
`~/cow/MEMORY.md` に保存され、長期的なユーザーの好み、重要な決定、主要な事実など、時間が経っても薄れない情報を含みます。Agent はツールを通じてこのファイルを読み書きし、長期的な知識を維持します。
|
||||
|
||||
### 日次記憶 (memory/YYYY-MM-DD.md)
|
||||
### 日次記憶(memory/YYYY-MM-DD.md)
|
||||
|
||||
`~/cow/memory/` ディレクトリに保存され、日付で命名されます(例:`2026-03-08.md`)。日々の会話の要約と主要なイベントを記録します。空ファイルの生成を避けるため、最初の書き込み時にのみファイルが作成されます。
|
||||
|
||||
## 記憶の書き込み
|
||||
## 自動書き込み
|
||||
|
||||
Agent は以下のメカニズムにより、会話内容を日次記憶に自動的に永続化します:
|
||||
Agent は以下のメカニズムにより、会話内容を長期記憶に自動的に永続化します:
|
||||
|
||||
- **コンテキストトリミング時** — 会話ターン数またはトークン数が設定上限を超えた場合、コンテキストの古い半分が一括でトリミングされ、破棄されたコンテンツは LLM によって要約されて重要な情報として日次記憶ファイルに書き込まれます
|
||||
- **コンテキストトリミング時** — 会話ターン数またはトークン数が設定上限を超えた場合、最も古い半分のコンテキストがトリミングされ、LLM によって要約されて日次記憶ファイルに書き込まれます
|
||||
- **毎日のスケジュール要約** — 毎日 23:55 に自動的にフル要約がトリガーされ、アクティビティが少ない日でも記憶が保存されます(内容が変更されていない場合はスキップ)
|
||||
- **API コンテキストオーバーフロー時** — モデル API がコンテキストオーバーフローエラーを返した場合、緊急措置として現在の会話要約が保存されます
|
||||
|
||||
@@ -40,27 +40,10 @@ Agent は以下のメカニズムにより、会話内容を日次記憶に自
|
||||
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
## 記憶の検索
|
||||
|
||||
記憶システムはハイブリッド検索モードをサポートしています:
|
||||
|
||||
- **キーワード検索** — キーワードに基づいて過去の記憶をマッチング
|
||||
- **ベクトル検索** — セマンティック類似性検索により、異なる表現でも関連する記憶を発見
|
||||
|
||||
Agent は必要に応じて会話中に自動的に記憶検索をトリガーし、関連する過去の情報をコンテキストに組み込みます。コア記憶(`MEMORY.md`)は常にシステムプロンプトに注入され、日次記憶は検索を通じてオンデマンドで読み込まれます。
|
||||
|
||||
## 設定
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_workspace": "~/cow",
|
||||
"agent_max_context_tokens": 40000,
|
||||
"agent_max_context_turns": 20
|
||||
}
|
||||
```
|
||||
|
||||
| パラメータ | 説明 | デフォルト |
|
||||
| --- | --- | --- |
|
||||
| `agent_workspace` | ワークスペースパス、記憶ファイルはこのディレクトリ配下に保存されます | `~/cow` |
|
||||
| `agent_max_context_tokens` | 最大コンテキストトークン数。超過時に半分がトリミングされ、記憶として要約されます | `40000` |
|
||||
| `agent_max_context_turns` | 最大コンテキストターン数。超過時に半分がトリミングされ、記憶として要約されます | `20` |
|
||||
| `agent_max_context_tokens` | 最大コンテキストトークン数。超過時にトリミングされ、記憶として要約されます | `50000` |
|
||||
| `agent_max_context_turns` | 最大コンテキストターン数。超過時にトリミングされ、記憶として要約されます | `20` |
|
||||
@@ -5,6 +5,7 @@ description: CowAgent バージョン履歴
|
||||
|
||||
| バージョン | 日付 | 説明 |
|
||||
| --- | --- | --- |
|
||||
| [2.0.5](/ja/releases/v2.0.5) | 2026.04.01 | Cow CLI、Skill Hub オープンソース、ブラウザツール、企業微信スキャン作成、その他改善 |
|
||||
| [2.0.4](/ja/releases/v2.0.4) | 2026.03.22 | 個人WeChatチャネル追加、新モデルサポート、日本語ドキュメント、スクリプトリファクタリングおよび複数修正 |
|
||||
| [2.0.2](/ja/releases/v2.0.2) | 2026.02.27 | Web Console アップグレード、マルチチャネル同時実行、セッション永続化 |
|
||||
| [2.0.1](/en/releases/v2.0.1) | 2026.02.27 | 組み込み Web Search ツール、スマートコンテキスト管理、複数の修正 |
|
||||
|
||||
77
docs/ja/releases/v2.0.5.mdx
Normal file
77
docs/ja/releases/v2.0.5.mdx
Normal file
@@ -0,0 +1,77 @@
|
||||
---
|
||||
title: v2.0.5
|
||||
description: CowAgent 2.0.5 - Cow CLI、Skill Hub オープンソース、ブラウザツール、企業微信スキャン作成、その他改善
|
||||
---
|
||||
|
||||
## 🖥️ Cow CLI コマンドシステム
|
||||
|
||||
ターミナルと会話の両方で CowAgent を管理する新しい CLI コマンドシステム:
|
||||
|
||||
- **ターミナルコマンド**:`cow <コマンド>` で `start`、`stop`、`restart`、`update`、`status`、`logs` などを実行
|
||||
- **チャットコマンド**:会話で `/<コマンド>` を入力して `/help`、`/status`、`/config`、`/skill`、`/context`、`/logs`、`/version` など
|
||||
- **Web コンソール**:入力欄で `/` を入力するとスラッシュコマンドメニューが表示、矢印キーで入力履歴を辿れる
|
||||
- **Windows サポート**:PowerShell スクリプト `scripts/run.ps1` を追加、`cow` コマンドに対応
|
||||
|
||||
ドキュメント:[コマンド一覧](https://docs.cowagent.ai/ja/commands)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />
|
||||
|
||||
## 🧩 Cow Skill Hub オープンソース
|
||||
|
||||
[Cow Skill Hub](https://skills.cowagent.ai)(スキル広場)がオープンソースとして公開。AI Agent スキルの閲覧、検索、インストール、公開が可能:
|
||||
|
||||
- **ワンコマンドインストール**:会話で `/skill install <名前>` またはターミナルで `cow skill install <名前>`
|
||||
- **マルチソース**:Skill Hub、GitHub、ClawHub、LinkAI などからインストール可能
|
||||
- **検索**:`/skill search` と `/skill list --remote` でスキル広場を閲覧・検索
|
||||
- **スキル公開**:[skills.cowagent.ai/submit](https://skills.cowagent.ai/submit) で自作スキルを提出
|
||||
- **ミラー加速**:中国国内向けミラーダウンロード対応
|
||||
|
||||
オープンソースリポジトリ:[cow-skill-hub](https://github.com/zhayujie/cow-skill-hub)
|
||||
|
||||
ドキュメント:[スキル広場](https://docs.cowagent.ai/ja/skills/hub)、[スキルのインストール](https://docs.cowagent.ai/ja/skills/install)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="750" />
|
||||
|
||||
## 🌐 ブラウザツール
|
||||
|
||||
新しい Browser ツール — Agent が Chromium ブラウザを制御して Web ページにアクセス・操作:
|
||||
|
||||
- **ナビゲーションと操作**:`navigate`、`click`、`fill`、`select`、`scroll`、`press` など
|
||||
- **ページスナップショット**:精簡 DOM スナップショットで Agent がページ構造を効率的に理解、ナビゲーション後に自動スナップショット
|
||||
- **スクリーンショット**:ワークスペースにページのスクリーンショットを保存
|
||||
- **JavaScript 実行**:ページでカスタムスクリプトを実行
|
||||
- **CLI インストール**:`cow install-browser` でワンコマンドセットアップ
|
||||
- **Docker サポート**:Docker イメージにブラウザインストール組み込み
|
||||
|
||||
ドキュメント:[ブラウザツール](https://docs.cowagent.ai/ja/tools/browser)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401115728.png" width="750" />
|
||||
|
||||
## 🤖 企業微信 Bot スキャン作成
|
||||
|
||||
企業微信 Bot チャネルで QR コードスキャンによるワンクリック作成をサポート:
|
||||
|
||||
- **Web コンソールでスキャン**:「スキャン接入」モードを選択し、企業微信でスキャンするとボットが自動作成・接続
|
||||
- **手動モード**:既存の Bot ID と Secret を手動入力する方式も引き続きサポート
|
||||
- **ストリーム配信最適化**:WebSocket 混雑を避けるためのスロットリング
|
||||
|
||||
ドキュメント:[企業微信 Bot](https://docs.cowagent.ai/ja/channels/wecom-bot)
|
||||
|
||||
PR:[#2735](https://github.com/zhayujie/chatgpt-on-wechat/pull/2735)。Thanks [@WecomTeam](https://github.com/WecomTeam)
|
||||
|
||||
## 🐛 その他の改善と修正
|
||||
|
||||
- **DeepSeek モジュール**:独立 DeepSeek Bot、`deepseek_api_key` 専用設定対応([#2719](https://github.com/zhayujie/chatgpt-on-wechat/pull/2719))。Thanks [@6vision](https://github.com/6vision)
|
||||
- **Web コンソール**:スラッシュコマンドメニュー、入力履歴、新モデル選択肢、モバイル最適化([#2731](https://github.com/zhayujie/chatgpt-on-wechat/pull/2731))。Thanks [@zkjqd](https://github.com/zkjqd)
|
||||
- **コンテキスト**:トリミング後のコンテキスト喪失を修正([393f0c0](https://github.com/zhayujie/chatgpt-on-wechat/commit/393f0c0))
|
||||
- **システムプロンプト**:毎ターン再構築されない問題を修正([13f5fde](https://github.com/zhayujie/chatgpt-on-wechat/commit/13f5fde))
|
||||
- **Gemini**:GoogleGeminiBot の model 属性欠落を修正([#2716](https://github.com/zhayujie/chatgpt-on-wechat/pull/2716))。Thanks [@cowagent](https://github.com/cowagent)
|
||||
- **WeChat チャネル**:ファイル送信失敗・ファイル名消失の修正([6d9b7ba](https://github.com/zhayujie/chatgpt-on-wechat/commit/6d9b7ba)、[45faa9c](https://github.com/zhayujie/chatgpt-on-wechat/commit/45faa9c))
|
||||
- **Docker**:ボリューム権限修正、イメージサイズ削減([3eb8348](https://github.com/zhayujie/chatgpt-on-wechat/commit/3eb8348)、[4470d4c](https://github.com/zhayujie/chatgpt-on-wechat/commit/4470d4c))
|
||||
- **セキュリティ**:Memory Content パストラバーサルリスクを修正。Thanks [@August829](https://github.com/August829)
|
||||
|
||||
## 📦 アップグレード
|
||||
|
||||
`cow update` または `./run.sh update` でアップグレード、またはコードを手動で pull して再起動。詳細は[アップグレードガイド](https://docs.cowagent.ai/ja/guide/upgrade)を参照。
|
||||
|
||||
**リリース日**:2026.04.01 | [Full Changelog](https://github.com/zhayujie/chatgpt-on-wechat/compare/2.0.4...master)
|
||||
58
docs/ja/skills/create.mdx
Normal file
58
docs/ja/skills/create.mdx
Normal file
@@ -0,0 +1,58 @@
|
||||
---
|
||||
title: スキルの作成
|
||||
description: 会話を通じてカスタムスキルを作成
|
||||
---
|
||||
|
||||
CowAgent には Skill Creator が組み込まれており、自然言語の会話を通じてスキルの作成、インストール、更新を素早く行えます。
|
||||
|
||||
## 使い方
|
||||
|
||||
会話で作りたいスキルを説明するだけで、Agent が自動的に作成します:
|
||||
|
||||
- ワークフローをスキル化:「このデプロイプロセスからスキルを作成して」
|
||||
- サードパーティ API の統合:「この API ドキュメントに基づいてスキルを作成して」
|
||||
- リモートスキルのインストール:「xxx スキルをインストールして」
|
||||
|
||||
## 作成フロー
|
||||
|
||||
1. 作成したいスキルを Agent に伝えます
|
||||
2. Agent が自動的に `SKILL.md` の説明と実行スクリプトを生成します
|
||||
3. スキルはワークスペースの `~/cow/skills/` ディレクトリに保存されます
|
||||
4. 以降の会話で Agent が自動的にそのスキルを認識し使用します
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
## SKILL.md のフォーマット
|
||||
|
||||
作成されたスキルは標準の SKILL.md フォーマットに従います:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: my-skill
|
||||
description: Brief description of the skill
|
||||
metadata:
|
||||
emoji: 🔧
|
||||
requires:
|
||||
bins: ["curl"]
|
||||
env: ["MY_API_KEY"]
|
||||
primaryEnv: "MY_API_KEY"
|
||||
---
|
||||
|
||||
# My Skill
|
||||
|
||||
Detailed instructions...
|
||||
```
|
||||
|
||||
| フィールド | 説明 |
|
||||
| --- | --- |
|
||||
| `name` | スキル名。ディレクトリ名と一致する必要があります |
|
||||
| `description` | スキルの説明。Agent はこれに基づいて呼び出すかどうかを判断します |
|
||||
| `metadata.requires.bins` | 必要なシステムコマンド |
|
||||
| `metadata.requires.env` | 必要な環境変数 |
|
||||
| `metadata.always` | 常に読み込む(デフォルトは false) |
|
||||
|
||||
<Tip>
|
||||
詳細は [Skill Creator のドキュメント](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md)をご覧ください。
|
||||
</Tip>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: Image Vision
|
||||
description: OpenAI の Vision モデルを使用して画像を認識
|
||||
---
|
||||
|
||||
OpenAI の GPT-4 Vision API を使用して画像の内容を分析し、画像内のオブジェクト、テキスト、色などの要素を理解します。
|
||||
|
||||
## 依存関係
|
||||
|
||||
| 依存関係 | 説明 |
|
||||
| --- | --- |
|
||||
| `OPENAI_API_KEY` | OpenAI API キー |
|
||||
| `curl`, `base64` | システムコマンド(通常プリインストール済み) |
|
||||
|
||||
設定方法:
|
||||
|
||||
- `env_config` Tool で `OPENAI_API_KEY` を設定
|
||||
- または `config.json` で `open_ai_api_key` を設定
|
||||
|
||||
## 対応モデル
|
||||
|
||||
- `gpt-4.1-mini`(推奨、コストパフォーマンスに優れる)
|
||||
- `gpt-4.1`
|
||||
|
||||
## 使い方
|
||||
|
||||
設定が完了したら、Agent に画像を送信すると自動的に画像認識がトリガーされます。
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
|
||||
</Frame>
|
||||
@@ -1,35 +1,32 @@
|
||||
---
|
||||
title: Skill 概要
|
||||
description: CowAgent の Skill システム紹介
|
||||
title: スキル概要
|
||||
description: CowAgent のスキルシステム紹介
|
||||
---
|
||||
|
||||
Skill は Agent に無限の拡張性を提供します。各 Skill は説明ファイル(`SKILL.md`)、実行スクリプト(任意)、リソース(任意)で構成され、特定のタスクをどのように遂行するかを記述します。
|
||||
スキル(Skill)は Agent に無限の拡張性を提供します。各スキルは説明ファイル(`SKILL.md`)、実行スクリプト(任意)、リソース(任意)で構成され、特定のタスクをどのように遂行するかを記述します。
|
||||
|
||||
Skill と Tool の違い:Tool はコードで実装された原子的な操作(例:ファイルの読み書き、コマンドの実行)であるのに対し、Skill は説明ファイルに基づく高レベルなワークフローであり、複数の Tool を組み合わせて複雑なタスクを完遂できます。
|
||||
スキルとツールの違い:ツールはコードで実装された原子的な操作(例:ファイルの読み書き、コマンドの実行)であるのに対し、スキルは説明ファイルに基づく高レベルなワークフローであり、複数のツールを組み合わせて複雑なタスクを完遂できます。
|
||||
|
||||
## 組み込み Skill
|
||||
## スキルの取得
|
||||
|
||||
プロジェクトの `skills/` ディレクトリに配置されており、依存条件に基づいて自動的に有効化されます:
|
||||
CowAgent ではスキルを取得する複数の方法を提供しています:
|
||||
|
||||
| Skill | 説明 | 依存関係 |
|
||||
| --- | --- | --- |
|
||||
| [`skill-creator`](/ja/skills/skill-creator) | 会話を通じてカスタム Skill を作成 | なし |
|
||||
| [`openai-image-vision`](/ja/skills/image-vision) | OpenAI の Vision モデルを使用して画像を認識 | `OPENAI_API_KEY` |
|
||||
| [`linkai-agent`](/ja/skills/linkai-agent) | LinkAI プラットフォームの Agent を統合 | `LINKAI_API_KEY` |
|
||||
| [`web-fetch`](/ja/skills/web-fetch) | Web ページのテキストコンテンツを取得 | `curl`(デフォルトで有効) |
|
||||
- **Cow スキル広場** — `/skill list --remote` でコミュニティスキルを閲覧・インストール
|
||||
- **GitHub** — GitHub リポジトリから直接インストール、バッチインストールにも対応
|
||||
- **ClawHub** — `/skill install clawhub:名前` で ClawHub のスキルをインストール
|
||||
- **URL** — zip アーカイブや SKILL.md リンクからインストール
|
||||
- **会話で作成** — 自然言語の会話を通じて Agent にスキルを自動作成させる
|
||||
|
||||
## カスタム Skill
|
||||
詳細は[スキルのインストール](/ja/skills/install)と[スキル管理コマンド](/ja/commands/skill)を参照してください。会話を通じて[スキルを作成](/ja/skills/create)することもできます。
|
||||
|
||||
ユーザーが会話を通じて作成し、ワークスペース(`~/cow/skills/`)に保存されます。任意の複雑なビジネスプロセスやサードパーティシステムとの連携を実装できます。
|
||||
## スキルの読み込み優先順位
|
||||
|
||||
## Skill の読み込み優先順位
|
||||
1. **ワークスペースのスキル**(最高優先):`~/cow/skills/`
|
||||
2. **プロジェクト組み込みスキル**(最低優先):`skills/`
|
||||
|
||||
1. **ワークスペースの Skill**(最高優先):`~/cow/skills/`
|
||||
2. **プロジェクト組み込み Skill**(最低優先):`skills/`
|
||||
同名のスキルは優先順位に従って上書きされます。
|
||||
|
||||
同名の Skill は優先順位に従って上書きされます。
|
||||
|
||||
## Skill のファイル構成
|
||||
## スキルのファイル構成
|
||||
|
||||
```
|
||||
skills/
|
||||
@@ -60,8 +57,8 @@ Detailed instructions...
|
||||
|
||||
| フィールド | 説明 |
|
||||
| --- | --- |
|
||||
| `name` | Skill 名。ディレクトリ名と一致する必要があります |
|
||||
| `description` | Skill の説明。Agent はこれに基づいて呼び出すかどうかを判断します |
|
||||
| `name` | スキル名。ディレクトリ名と一致する必要があります |
|
||||
| `description` | スキルの説明。Agent はこれに基づいて呼び出すかどうかを判断します |
|
||||
| `metadata.requires.bins` | 必要なシステムコマンド |
|
||||
| `metadata.requires.env` | 必要な環境変数 |
|
||||
| `metadata.always` | 常に読み込む(デフォルトは false) |
|
||||
|
||||
53
docs/ja/skills/install.mdx
Normal file
53
docs/ja/skills/install.mdx
Normal file
@@ -0,0 +1,53 @@
|
||||
---
|
||||
title: スキルのインストール
|
||||
description: 統一コマンドで多様なソースからスキルをインストール
|
||||
---
|
||||
|
||||
CowAgent は統一された `install` コマンドで、**Cow スキル広場、GitHub、ClawHub** および任意の URL からスキルをインストールできます。チャットでは `/skill install`、ターミナルでは `cow skill install` を使用します。
|
||||
|
||||
## スキル広場からインストール
|
||||
|
||||
スキル広場を閲覧してインストール:
|
||||
|
||||
```text
|
||||
/skill list --remote
|
||||
/skill install pptx
|
||||
```
|
||||
|
||||
## GitHub からインストール
|
||||
|
||||
リポジトリからの一括インストールとサブディレクトリ指定に対応:
|
||||
|
||||
```text
|
||||
/skill install larksuite/cli
|
||||
/skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
```
|
||||
|
||||
## ClawHub からインストール
|
||||
|
||||
```text
|
||||
/skill install clawhub:baidu-search
|
||||
```
|
||||
|
||||
## URL からインストール
|
||||
|
||||
zip アーカイブと SKILL.md ファイルリンクに対応:
|
||||
|
||||
```text
|
||||
/skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
/skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
|
||||
## スキルの管理
|
||||
|
||||
```text
|
||||
/skill list # インストール済みスキルを表示
|
||||
/skill info pptx # スキルの詳細を表示
|
||||
/skill enable pptx # スキルを有効化
|
||||
/skill disable pptx # スキルを無効化
|
||||
/skill uninstall pptx # スキルをアンインストール
|
||||
```
|
||||
|
||||
<Tip>
|
||||
上記のすべてのコマンドは、ターミナルでは `/skill` を `cow skill` に置き換えて使用できます。完全なコマンドドキュメントは[スキル管理コマンド](/ja/commands/skill)を参照してください。
|
||||
</Tip>
|
||||
@@ -1,47 +0,0 @@
|
||||
---
|
||||
title: LinkAI Agent
|
||||
description: LinkAI プラットフォームのマルチ Agent Skill を統合
|
||||
---
|
||||
|
||||
[LinkAI](https://link-ai.tech/) プラットフォームの Agent を Skill として使用し、マルチ Agent の意思決定を行います。Agent は Agent 名と説明に基づいてインテリジェントに選択し、`app_code` を通じて対応するアプリケーションやワークフローを呼び出します。
|
||||
|
||||
## 依存関係
|
||||
|
||||
| 依存関係 | 説明 |
|
||||
| --- | --- |
|
||||
| `LINKAI_API_KEY` | LinkAI プラットフォームの API キー。[コンソール](https://link-ai.tech/console/interface)で作成 |
|
||||
| `curl` | システムコマンド(通常プリインストール済み) |
|
||||
|
||||
設定方法:
|
||||
|
||||
- `env_config` Tool で `LINKAI_API_KEY` を設定
|
||||
- または `config.json` で `linkai_api_key` を設定
|
||||
|
||||
## Agent の設定
|
||||
|
||||
`skills/linkai-agent/config.json` で利用可能な Agent を追加します:
|
||||
|
||||
```json
|
||||
{
|
||||
"apps": [
|
||||
{
|
||||
"app_code": "G7z6vKwp",
|
||||
"app_name": "LinkAI Customer Support",
|
||||
"app_description": "Select this assistant only when the user needs help with LinkAI platform questions"
|
||||
},
|
||||
{
|
||||
"app_code": "SFY5x7JR",
|
||||
"app_name": "Content Creator",
|
||||
"app_description": "Use this assistant only when the user needs to create images or videos"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## 使い方
|
||||
|
||||
設定が完了すると、Agent はユーザーの質問に基づいて適切な LinkAI Agent を自動的に選択します。
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
|
||||
</Frame>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: Skill Creator
|
||||
description: 会話を通じてカスタム Skill を作成
|
||||
---
|
||||
|
||||
自然言語の会話を通じて、Skill の作成、インストール、更新を素早く行えます。
|
||||
|
||||
## 依存関係
|
||||
|
||||
追加の依存関係は不要で、常に利用可能です。
|
||||
|
||||
## 使い方
|
||||
|
||||
- ワークフローを Skill 化:「このデプロイプロセスから Skill を作成して」
|
||||
- サードパーティ API の統合:「この API ドキュメントに基づいて Skill を作成して」
|
||||
- リモート Skill のインストール:「xxx Skill をインストールして」
|
||||
|
||||
## 作成フロー
|
||||
|
||||
1. 作成したい Skill を Agent に伝えます
|
||||
2. Agent が自動的に `SKILL.md` の説明と実行スクリプトを生成します
|
||||
3. Skill はワークスペースの `~/cow/skills/` ディレクトリに保存されます
|
||||
4. 以降の会話で Agent が自動的にその Skill を認識し使用します
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
<Tip>
|
||||
詳細は [Skill Creator のドキュメント](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md)をご覧ください。
|
||||
</Tip>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: Web Fetch
|
||||
description: Web ページのテキストコンテンツを取得
|
||||
---
|
||||
|
||||
curl を使用して Web ページを取得し、読み取り可能なテキストコンテンツを抽出します。ブラウザ自動化を必要としない軽量な Web アクセス方法です。
|
||||
|
||||
## 依存関係
|
||||
|
||||
| 依存関係 | 説明 |
|
||||
| --- | --- |
|
||||
| `curl` | システムコマンド(通常プリインストール済み) |
|
||||
|
||||
この Skill は `always: true` が設定されており、システムに `curl` コマンドがあればデフォルトで有効になります。
|
||||
|
||||
## 使い方
|
||||
|
||||
Agent が URL からコンテンツを取得する必要がある場合に自動的に呼び出されます。追加の設定は不要です。
|
||||
|
||||
## browser Tool との比較
|
||||
|
||||
| 機能 | web-fetch (Skill) | browser (Tool) |
|
||||
| --- | --- | --- |
|
||||
| 依存関係 | curl のみ | browser-use + playwright |
|
||||
| JS レンダリング | 非対応 | 対応 |
|
||||
| ページ操作 | 非対応 | クリック、入力などに対応 |
|
||||
| 最適な用途 | 静的ページのテキスト | 動的な Web ページ |
|
||||
|
||||
<Tip>
|
||||
ほとんどの Web コンテンツ取得シナリオでは、web-fetch で十分です。JS レンダリングやページ操作が必要な場合にのみ browser Tool を使用してください。
|
||||
</Tip>
|
||||
80
docs/memory/context.mdx
Normal file
80
docs/memory/context.mdx
Normal file
@@ -0,0 +1,80 @@
|
||||
---
|
||||
title: 短期记忆
|
||||
description: 对话上下文 — 消息管理、压缩策略和上下文操作
|
||||
---
|
||||
|
||||
对话上下文是 Agent 的短期记忆,包含当前会话中的所有消息(用户输入、Agent 回复、工具调用及结果)。合理管理上下文对于 Agent 的推理质量和成本控制至关重要。
|
||||
|
||||
## 上下文结构
|
||||
|
||||
每一轮对话由以下消息组成:
|
||||
|
||||
```
|
||||
用户消息 → Agent 思考 → 工具调用 → 工具结果 → ... → Agent 最终回复
|
||||
```
|
||||
|
||||
一轮中可能包含多次工具调用(Agent 的决策步数由 `agent_max_steps` 控制),所有工具调用和结果都会保留在上下文中,直到被压缩或裁剪。
|
||||
|
||||
## 关键配置
|
||||
|
||||
| 参数 | 说明 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `agent_max_context_tokens` | 上下文最大 token 预算 | `50000` |
|
||||
| `agent_max_context_turns` | 上下文最大对话轮次 | `20` |
|
||||
| `agent_max_steps` | 单轮对话最大决策步数(工具调用次数) | `15` |
|
||||
|
||||
可通过 `config.json` 或对话中的 `/config` 命令修改。
|
||||
|
||||
## 压缩策略
|
||||
|
||||
当上下文超出限制时,系统会自动执行压缩以释放空间。整个过程分为多个阶段:
|
||||
|
||||
### 1. 工具结果截断
|
||||
|
||||
在每次决策循环开始前,系统会检查历史轮次中的工具调用结果。超过 **20000 字符** 的工具结果会被截断,仅保留首尾内容和截断说明。当前轮次的工具结果不受影响。
|
||||
|
||||
### 2. 轮次裁剪
|
||||
|
||||
当对话轮次超过 `agent_max_context_turns` 时:
|
||||
|
||||
- 裁剪 **最早一半** 的完整轮次(保证工具调用链的完整性)
|
||||
- 被裁剪的消息会通过 LLM 总结后**写入当天的日级记忆文件**
|
||||
- 剩余轮次保持不变
|
||||
|
||||
### 3. Token 预算裁剪
|
||||
|
||||
裁剪轮次后,如果 token 数仍超出预算:
|
||||
|
||||
- **轮次 < 5 时**:对所有轮次进行**文本压缩** — 每轮只保留第一条用户文本和最后一条 Agent 回复,去掉中间的工具调用链
|
||||
- **轮次 ≥ 5 时**:再次裁剪**前半轮次**,被丢弃内容同样写入记忆
|
||||
|
||||
### 4. 溢出应急处理
|
||||
|
||||
当模型 API 返回上下文溢出错误时:
|
||||
|
||||
1. 先将当前所有消息总结写入记忆
|
||||
2. 执行激进裁剪(工具结果限制 10K 字符、用户文本限制 10K、最多保留 5 轮)
|
||||
3. 如果仍然溢出,清空整个对话上下文
|
||||
|
||||
## 会话持久化
|
||||
|
||||
对话消息会持久化到本地数据库,服务重启后自动恢复。恢复策略:
|
||||
|
||||
- 恢复最近的 **`max(3, max_context_turns / 6)`** 轮对话
|
||||
- 只保留每轮的**用户文本和 Agent 最终回复**,不恢复中间工具调用链
|
||||
- 超过 **30 天**的历史会话自动清理
|
||||
|
||||
## 操作命令
|
||||
|
||||
在对话中可以使用以下命令管理上下文:
|
||||
|
||||
| 命令 | 说明 |
|
||||
| --- | --- |
|
||||
| `/context` | 查看当前上下文统计(消息数、角色分布、总字符数) |
|
||||
| `/context clear` | 清空当前会话上下文 |
|
||||
| `/config agent_max_context_tokens 80000` | 调整上下文 token 预算 |
|
||||
| `/config agent_max_context_turns 30` | 调整上下文轮次上限 |
|
||||
|
||||
<Tip>
|
||||
清空上下文后,Agent 会"忘记"之前的对话内容。被裁剪和清空的内容如果已经写入长期记忆,仍可通过记忆检索找回。
|
||||
</Tip>
|
||||
@@ -1,30 +1,39 @@
|
||||
---
|
||||
title: 长期记忆
|
||||
description: CowAgent 的长期记忆系统
|
||||
description: CowAgent 的长期记忆系统 — 文件持久化、自动写入与混合检索
|
||||
---
|
||||
|
||||
记忆系统让 Agent 能够长期记住重要信息,在对话中不断积累经验、理解用户偏好,真正实现自主思考和持续成长。
|
||||
长期记忆保存在工作空间文件中,跨会话持久存在。Agent 在对话中通过检索工具按需加载历史记忆,也会在上下文裁剪时自动将对话摘要写入长期记忆。
|
||||
|
||||
## 记忆类型
|
||||
|
||||
### 核心记忆(MEMORY.md)
|
||||
|
||||
存储在 `~/cow/MEMORY.md` 中,包含用户的长期偏好、重要决策、关键事实等不会随时间淡化的信息。每次对话时自动注入系统提示词,作为 Agent 的背景知识。
|
||||
存储在 `~/cow/MEMORY.md` 中,包含用户的长期偏好、重要决策、关键事实等不会随时间淡化的信息。Agent 可通过工具读写此文件来维护长期知识。
|
||||
|
||||
### 天级记忆(memory/YYYY-MM-DD.md)
|
||||
### 日级记忆(memory/YYYY-MM-DD.md)
|
||||
|
||||
存储在 `~/cow/memory/` 目录下,按日期命名(如 `2026-03-08.md`),记录每天的对话摘要和关键事件。仅在首次写入时创建,避免生成空文件。
|
||||
|
||||
## 记忆写入
|
||||
## 自动写入
|
||||
|
||||
Agent 通过以下机制自动将对话内容持久化为天级记忆:
|
||||
Agent 通过以下机制自动将对话内容持久化为长期记忆:
|
||||
|
||||
- **上下文裁剪时** — 当对话轮次或 token 超出配置上限时,批量裁剪最早一半的上下文,并使用 LLM 将被裁剪的内容总结为关键信息写入当天记忆文件
|
||||
- **上下文裁剪时** — 当对话轮次或 token 超出配置上限时,裁剪最早一半的上下文,使用 LLM 将被裁剪的内容总结为关键信息写入当天记忆文件
|
||||
- **每日定时总结** — 每天 23:55 自动触发一次全量总结,防止低活跃日无记忆留存(内容无变化时自动跳过)
|
||||
- **API 上下文溢出时** — 当模型 API 返回上下文溢出错误时,紧急保存当前对话摘要
|
||||
|
||||
所有记忆写入均在后台异步执行(LLM 总结 + 文件写入),不阻塞正常对话回复。
|
||||
|
||||
## 记忆检索
|
||||
|
||||
记忆系统支持混合检索模式:
|
||||
|
||||
- **关键词检索** — 基于 FTS5 全文索引匹配历史记忆,支持 BM25 排序
|
||||
- **向量检索** — 基于 embedding 语义相似度搜索,即使表述不同也能找到相关记忆
|
||||
|
||||
Agent 会在对话中根据需要自动触发记忆检索,将相关历史信息纳入上下文。检索结果按混合评分排序(默认向量权重 0.7、关键词权重 0.3),日级记忆会随时间衰减(半衰期 30 天),核心记忆不衰减。
|
||||
|
||||
## 首次启动
|
||||
|
||||
首次启动 Agent 时,Agent 会主动向用户询问关键信息,并记录至工作空间(默认 `~/cow`)中:
|
||||
@@ -34,33 +43,16 @@ Agent 通过以下机制自动将对话内容持久化为天级记忆:
|
||||
| `system.md` | Agent 的系统提示词和行为设定 |
|
||||
| `user.md` | 用户身份信息和偏好 |
|
||||
| `MEMORY.md` | 核心记忆(长期) |
|
||||
| `memory/YYYY-MM-DD.md` | 天级记忆(按需创建) |
|
||||
| `memory/YYYY-MM-DD.md` | 日级记忆(按需创建) |
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
## 记忆检索
|
||||
|
||||
记忆系统支持混合检索模式:
|
||||
|
||||
- **关键词检索** — 基于关键词匹配历史记忆
|
||||
- **向量检索** — 基于语义相似度搜索,即使表述不同也能找到相关记忆
|
||||
|
||||
Agent 会在对话中根据需要自动触发记忆检索,将相关历史信息纳入上下文。核心记忆(`MEMORY.md`)始终注入系统提示词,天级记忆通过检索按需加载。
|
||||
|
||||
## 相关配置
|
||||
|
||||
```json
|
||||
{
|
||||
"agent_workspace": "~/cow",
|
||||
"agent_max_context_tokens": 40000,
|
||||
"agent_max_context_turns": 20
|
||||
}
|
||||
```
|
||||
|
||||
| 参数 | 说明 | 默认值 |
|
||||
| --- | --- | --- |
|
||||
| `agent_workspace` | 工作空间路径,记忆文件存储在此目录下 | `~/cow` |
|
||||
| `agent_max_context_tokens` | 最大上下文 token 数,超出时裁剪一半并总结写入记忆 | `40000` |
|
||||
| `agent_max_context_turns` | 最大上下文轮次,超出时裁剪一半并总结写入记忆 | `20` |
|
||||
| `agent_max_context_tokens` | 最大上下文 token 数,超出时裁剪并总结写入记忆 | `50000` |
|
||||
| `agent_max_context_turns` | 最大上下文轮次,超出时裁剪并总结写入记忆 | `20` |
|
||||
@@ -5,6 +5,7 @@ description: CowAgent 版本更新历史
|
||||
|
||||
| 版本 | 日期 | 说明 |
|
||||
| --- | --- | --- |
|
||||
| [2.0.5](/releases/v2.0.5) | 2026.04.01 | Cow CLI、Skill Hub 开源、浏览器工具、企微扫码创建、多项优化和修复 |
|
||||
| [2.0.4](/releases/v2.0.4) | 2026.03.22 | 新增个人微信通道、新模型支持、日文文档、脚本重构及多项修复 |
|
||||
| [2.0.3](/releases/v2.0.3) | 2026.03.18 | 新增企微智能机器人和 QQ 通道、支持Coding Plan、新增多个模型、Web端文件处理、记忆系统升级 |
|
||||
| [2.0.2](/releases/v2.0.2) | 2026.02.27 | Web 控制台升级、多通道同时运行、会话持久化 |
|
||||
|
||||
84
docs/releases/v2.0.5.mdx
Normal file
84
docs/releases/v2.0.5.mdx
Normal file
@@ -0,0 +1,84 @@
|
||||
---
|
||||
title: v2.0.5
|
||||
description: CowAgent 2.0.5 - Cow CLI、Skill Hub 开源、浏览器工具、企微扫码创建、DeepSeek 独立模块及多项优化
|
||||
---
|
||||
|
||||
## 🖥️ Cow CLI 命令系统
|
||||
|
||||
新增 Cow CLI 命令系统,支持在终端和对话中执行命令,实现对 CowAgent 的全方位管理:
|
||||
|
||||
- **终端命令**:在系统终端中执行 `cow <命令>`,支持 `start`、`stop`、`restart`、`update`、`status`、`logs` 等服务管理操作
|
||||
- **对话命令**:在对话中输入 `/<命令>` 或 `cow <命令>`,支持 `/help`、`/status`、`/config`、`/skill`、`/context`、`/logs`、`/version` 等
|
||||
- **web控制台**:Web 控制台输入框输入 `/` 即可弹出指令菜单,支持方向键回溯历史输入
|
||||
- **Windows 支持**:新增 PowerShell 一键安装脚本 `scripts/run.ps1`,同时支持 `cow` 命令
|
||||
|
||||
相关文档:[命令总览](https://docs.cowagent.ai/commands)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />
|
||||
|
||||
## 🧩 Cow Skill Hub 开源
|
||||
|
||||
[Cow Skill Hub](https://skills.cowagent.ai)(技能广场)正式开源并上线,提供 AI Agent 技能的浏览、搜索、安装和发布,汇集精选技能、社区贡献技能、三方技能:
|
||||
|
||||
- **一键安装**:在对话中 `/skill install <名称>` 或终端 `cow skill install <名称>` 一键安装
|
||||
- **多来源支持**:支持安装 Skill Hub、GitHub、ClawHub、LinkAI 上的全部技能,支持 GitHub 批量安装和子目录指定
|
||||
- **技能搜索**:`/skill search` 和 `/skill list --remote` 浏览和搜索技能广场
|
||||
- **技能发布**:通过 [skills.cowagent.ai/submit](https://skills.cowagent.ai/submit) 提交自己的技能
|
||||
- **镜像加速**:支持 Skill Hub 镜像加速,国内环境下载更流畅
|
||||
|
||||
Skill Hub 开源仓库:[cow-skill-hub](https://github.com/zhayujie/cow-skill-hub)。
|
||||
|
||||
相关文档:[技能广场](https://docs.cowagent.ai/skills/hub)、[安装技能](https://docs.cowagent.ai/skills/install)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="750" />
|
||||
|
||||
|
||||
## 🌐 新增浏览器工具
|
||||
|
||||
新增 Browser 工具,Agent 可控制浏览器访问和操作网页:
|
||||
|
||||
- **网页导航与交互**:支持 `navigate`、`click`、`fill`、`select`、`scroll`、`press` 等操作
|
||||
- **页面快照**:使用精简 DOM 快照技术,让 Agent 高效理解页面结构,导航后自动快照
|
||||
- **截图能力**:支持页面截图保存到工作区
|
||||
- **JavaScript 执行**:支持在页面中执行自定义脚本
|
||||
- **CLI 安装**:通过 `cow install-browser` 一键安装浏览器及依赖,自动适配系统环境
|
||||
- **Docker 支持**:Docker 镜像已内置浏览器安装支持
|
||||
|
||||
相关文档:[浏览器工具](https://docs.cowagent.ai/tools/browser)。
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401115728.png" width="750" />
|
||||
|
||||
|
||||
## 🤖 企微智能机器人扫码创建
|
||||
|
||||
企业微信智能机器人通道新增扫码一键创建功能:
|
||||
|
||||
- **Web 控制台扫码**:在 Web 控制台通道页面,选择「扫码接入」模式,使用企业微信扫码即可自动创建并接入智能机器人,无需手动到企业微信后台配置
|
||||
- **手动模式保留**:同时保留「手动填写」模式,可输入已有的 Bot ID 和 Secret 接入
|
||||
- **流式推送优化**:增加推送节流,避免 WebSocket 拥塞
|
||||
|
||||
相关文档:[企微智能机器人接入](https://docs.cowagent.ai/channels/wecom-bot)。
|
||||
|
||||
相关提交:[#2735](https://github.com/zhayujie/chatgpt-on-wechat/pull/2735)
|
||||
|
||||
Thanks [@WecomTeam](https://github.com/WecomTeam)
|
||||
|
||||
## 🐛 其他优化与修复
|
||||
|
||||
- **DeepSeek 独立模块**:新增独立的 DeepSeek Bot 模块,支持 `deepseek_api_key` 专属配置,无需再通过 OpenAI 兼容方式接入([#2719](https://github.com/zhayujie/chatgpt-on-wechat/pull/2719))。Thanks [@6vision](https://github.com/6vision)
|
||||
- **Web 控制台优化**:新增斜杠指令菜单和输入历史回溯,新增模型选项,优化移动端适配([#2731](https://github.com/zhayujie/chatgpt-on-wechat/pull/2731))。Thanks [@zkjqd](https://github.com/zkjqd)
|
||||
- **上下文丢失**:修复上下文裁剪后丢失的问题 ([393f0c0](https://github.com/zhayujie/chatgpt-on-wechat/commit/393f0c0))
|
||||
- **系统提示词**:修复系统提示词未在每轮重建的问题 ([13f5fde](https://github.com/zhayujie/chatgpt-on-wechat/commit/13f5fde))
|
||||
- **Agent 响应**:去除 Agent 响应首尾空白字符 ([f890318](https://github.com/zhayujie/chatgpt-on-wechat/commit/f890318))
|
||||
- **视觉压缩**:优化视觉图片压缩策略 ([22b8ca0](https://github.com/zhayujie/chatgpt-on-wechat/commit/22b8ca0))
|
||||
- **Gemini 模型**:修复 GoogleGeminiBot 缺少 model 属性的问题([#2716](https://github.com/zhayujie/chatgpt-on-wechat/pull/2716))。Thanks [@cowagent](https://github.com/cowagent)
|
||||
- **微信通道**:修复文件发送失败、文件名丢失等问题 ([6d9b7ba](https://github.com/zhayujie/chatgpt-on-wechat/commit/6d9b7ba)、[baf66a1](https://github.com/zhayujie/chatgpt-on-wechat/commit/baf66a1)、[45faa9c](https://github.com/zhayujie/chatgpt-on-wechat/commit/45faa9c))
|
||||
- **Docker 优化**:修复卷权限问题,精简镜像体积 ([3eb8348](https://github.com/zhayujie/chatgpt-on-wechat/commit/3eb8348)、[4470d4c](https://github.com/zhayujie/chatgpt-on-wechat/commit/4470d4c))
|
||||
- **README 排版**:优化中英文排版空格([#2723](https://github.com/zhayujie/chatgpt-on-wechat/pull/2723))。Thanks [@Xiaozhou345](https://github.com/Xiaozhou345)
|
||||
- **安全修复**:修复 Memory Content路径遍历风险,Thanks [@August829](https://github.com/August829)
|
||||
|
||||
## 📦 升级方式
|
||||
|
||||
源码部署可执行 `cow update` 或 `./run.sh update` 一键升级,或手动拉取代码后重启。详见 [更新升级文档](https://docs.cowagent.ai/guide/upgrade)。
|
||||
|
||||
**发布日期**:2026.04.01 | [Full Changelog](https://github.com/zhayujie/chatgpt-on-wechat/compare/2.0.4...master)
|
||||
58
docs/skills/create.mdx
Normal file
58
docs/skills/create.mdx
Normal file
@@ -0,0 +1,58 @@
|
||||
---
|
||||
title: 创造技能
|
||||
description: 通过对话创建自定义技能
|
||||
---
|
||||
|
||||
CowAgent 内置了 Skill Creator,可以通过自然语言对话快速创建、安装或更新技能。
|
||||
|
||||
## 使用方式
|
||||
|
||||
直接在对话中描述你想要的技能,Agent 会自动完成创建:
|
||||
|
||||
- 将工作流程固化为技能:"帮我把这个部署流程创建为一个技能"
|
||||
- 对接第三方 API:"根据这个接口文档创建一个技能"
|
||||
- 安装远程技能:"帮我安装 xxx 技能"
|
||||
|
||||
## 创建流程
|
||||
|
||||
1. 告诉 Agent 你想创建的技能功能
|
||||
2. Agent 自动生成 `SKILL.md` 说明文件和运行脚本
|
||||
3. 技能保存到工作空间的 `~/cow/skills/` 目录
|
||||
4. 后续对话中 Agent 会自动识别并使用该技能
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
## SKILL.md 格式
|
||||
|
||||
创建的技能遵循标准的 SKILL.md 格式:
|
||||
|
||||
```markdown
|
||||
---
|
||||
name: my-skill
|
||||
description: Brief description of the skill
|
||||
metadata:
|
||||
emoji: 🔧
|
||||
requires:
|
||||
bins: ["curl"]
|
||||
env: ["MY_API_KEY"]
|
||||
primaryEnv: "MY_API_KEY"
|
||||
---
|
||||
|
||||
# My Skill
|
||||
|
||||
Detailed instructions...
|
||||
```
|
||||
|
||||
| 字段 | 说明 |
|
||||
| --- | --- |
|
||||
| `name` | 技能名称,需与目录名一致 |
|
||||
| `description` | 技能描述,Agent 据此决定是否调用 |
|
||||
| `metadata.requires.bins` | 依赖的系统命令 |
|
||||
| `metadata.requires.env` | 依赖的环境变量 |
|
||||
| `metadata.always` | 是否始终加载(默认 false) |
|
||||
|
||||
<Tip>
|
||||
详细开发文档可参考 [Skill Creator 说明](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md)。
|
||||
</Tip>
|
||||
65
docs/skills/hub.mdx
Normal file
65
docs/skills/hub.mdx
Normal file
@@ -0,0 +1,65 @@
|
||||
---
|
||||
title: 技能广场
|
||||
description: 浏览、搜索和安装 AI Agent 技能
|
||||
---
|
||||
|
||||
[Cow Skill Hub](https://skills.cowagent.ai/) 是开源的 AI Agent 技能广场,汇集了官方推荐、社区贡献和第三方平台(GitHub、ClawHub 等)的技能。
|
||||
|
||||
开源仓库:[github.com/zhayujie/cow-skill-hub](https://github.com/zhayujie/cow-skill-hub)
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401110103.png" width="800" />
|
||||
|
||||
## 功能
|
||||
|
||||
- **浏览技能**:按类别(推荐 / 社区 / 第三方)和标签筛选
|
||||
- **搜索技能**:按名称或描述搜索
|
||||
- **查看详情**:查看技能文档、文件内容、安装命令和依赖的环境变量
|
||||
- **一键安装**:复制安装命令即可在 CowAgent 中使用
|
||||
|
||||
## 安装技能
|
||||
|
||||
在对话中或终端中执行安装命令:
|
||||
|
||||
<CodeGroup>
|
||||
```text 对话
|
||||
/skill install <name>
|
||||
```
|
||||
|
||||
```bash 终端
|
||||
cow skill install <name>
|
||||
```
|
||||
</CodeGroup>
|
||||
|
||||
也可以在对话中浏览技能广场:
|
||||
|
||||
```text
|
||||
/skill list --remote
|
||||
/skill search <关键词>
|
||||
```
|
||||
|
||||
除了在列表中展示的精选技能,还可以通过 **CLI命令 + Skill Hub** 安装各种第三方技能(**GitHub、ClawHub、LinkAI、URL** 等)参考 [安装技能](/skills/install)。
|
||||
|
||||
## 贡献技能
|
||||
|
||||
欢迎向技能广场提交你的技能:
|
||||
|
||||
1. 访问 [skills.cowagent.ai/submit](https://skills.cowagent.ai/submit)
|
||||
2. 使用 GitHub 或 Google 账号登录
|
||||
3. 上传包含 `SKILL.md` 的文件夹或 zip 包
|
||||
4. 自动解析技能名称、显示名称和描述,可按需修改
|
||||
5. 提交后将经过安全检查和审核后发布
|
||||
|
||||
<img src="https://cdn.link-ai.tech/doc/20260401111904.png" width="800" />
|
||||
|
||||
技能文件结构:
|
||||
|
||||
```
|
||||
your-skill/
|
||||
├── SKILL.md # 必须,放在根目录
|
||||
├── scripts/ # 可选,运行脚本
|
||||
└── resources/ # 可选,其他资源
|
||||
```
|
||||
|
||||
<Tip>
|
||||
技能基于 `SKILL.md` 文件构建,你也可以在技能详情页下载 SKILL.md,用于任何支持自定义指令的 Agent(如 OpenClaw、Cursor、Claude Code 等)。
|
||||
</Tip>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: 图像识别
|
||||
description: 使用 OpenAI 视觉模型识别图片
|
||||
---
|
||||
|
||||
使用 OpenAI 的 GPT-4 Vision API 分析图片内容,理解图像中的物体、文字、颜色等元素。
|
||||
|
||||
## 依赖
|
||||
|
||||
| 依赖 | 说明 |
|
||||
| --- | --- |
|
||||
| `OPENAI_API_KEY` | OpenAI API 密钥 |
|
||||
| `curl`、`base64` | 系统命令(通常已预装) |
|
||||
|
||||
配置方式:
|
||||
|
||||
- 通过 `env_config` 工具配置 `OPENAI_API_KEY`
|
||||
- 或在 `config.json` 中填写 `open_ai_api_key`
|
||||
|
||||
## 支持的模型
|
||||
|
||||
- `gpt-4.1-mini`(推荐,性价比高)
|
||||
- `gpt-4.1`
|
||||
|
||||
## 使用方式
|
||||
|
||||
配置完成后,向 Agent 发送图片即可自动触发图像识别。
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
|
||||
</Frame>
|
||||
@@ -7,20 +7,18 @@ description: CowAgent 技能系统介绍
|
||||
|
||||
Skill 与 Tool 的区别:Tool 是由代码实现的原子操作(如读写文件、执行命令),Skill 则是基于说明文件的高级工作流,可以组合调用多个 Tool 来完成复杂任务。
|
||||
|
||||
## 内置技能
|
||||
## 获取技能
|
||||
|
||||
位于项目 `skills/` 目录下,根据依赖条件自动判断是否启用:
|
||||
CowAgent 提供多种方式获取技能:
|
||||
|
||||
| 技能 | 说明 | 依赖 |
|
||||
| --- | --- | --- |
|
||||
| [`skill-creator`](/skills/skill-creator) | 通过对话创建自定义技能 | 无 |
|
||||
| [`openai-image-vision`](/skills/image-vision) | 使用 OpenAI 视觉模型识别图片 | `OPENAI_API_KEY` |
|
||||
| [`linkai-agent`](/skills/linkai-agent) | 对接 LinkAI 平台智能体 | `LINKAI_API_KEY` |
|
||||
| [`web-fetch`](/skills/web-fetch) | 抓取网页文本内容 | `curl`(默认启用) |
|
||||
- **[Cow 技能广场](https://skills.cowagent.ai/)** — 在线浏览所有可用技能,或通过 `/skill list --remote` 在对话中浏览和安装
|
||||
- **GitHub** — 直接从 GitHub 仓库安装,支持批量安装
|
||||
- **ClawHub** — 通过 `/skill install clawhub:名称` 安装 ClawHub 上的技能 (4w+个)
|
||||
- **LinkA** — 通过 `/skill install linkai:编码` 安装 LinkAI 上的公开资源和创建的知识库/数据库/工作流/插件等资源
|
||||
- **URL** — 从 zip 压缩包或 SKILL.md 链接安装
|
||||
- **对话创建** — 通过自然语言对话让 Agent 自动创建技能
|
||||
|
||||
## 自定义技能
|
||||
|
||||
由用户通过对话创建,存放在工作空间中(`~/cow/skills/`),可实现任何复杂的业务流程和第三方系统对接。
|
||||
详细安装方式参考 [安装技能](/skills/install) 和 [技能管理命令](/commands/skill)。也可以通过对话 [创建技能](/skills/create),或向 [Skill Hub](https://skills.cowagent.ai/submit) 贡献你的技能。
|
||||
|
||||
## 技能加载优先级
|
||||
|
||||
|
||||
66
docs/skills/install.mdx
Normal file
66
docs/skills/install.mdx
Normal file
@@ -0,0 +1,66 @@
|
||||
---
|
||||
title: 安装技能
|
||||
description: 通过命令一键安装来自多种来源的技能
|
||||
---
|
||||
|
||||
CowAgent 支持通过统一的 `install` 命令安装来自 **[Cow 技能广场](https://skills.cowagent.ai/)、GitHub、ClawHub、LinkAI** 以及任意 URL 上的技能。在对话中使用 `/skill install`,在终端中使用 `cow skill install`。
|
||||
|
||||
## 从Cow技能广场安装
|
||||
|
||||
访问 [skills.cowagent.ai](https://skills.cowagent.ai/) 浏览所有可用技能,找到想要的技能后直接安装,例如:
|
||||
|
||||
```text
|
||||
/skill list --remote
|
||||
/skill install pptx
|
||||
```
|
||||
|
||||
## 从 GitHub 安装
|
||||
|
||||
> Github上的所有技能都可以直接安装,支持仓库级批量安装和指定子目录安装,例如:
|
||||
|
||||
```text
|
||||
/skill install larksuite/cli
|
||||
/skill install https://github.com/larksuite/cli/tree/main/skills/lark-im
|
||||
```
|
||||
|
||||
## 从 ClawHub 安装
|
||||
|
||||
[ClawHub](https://clawhub.ai/) 上的所有技能 (4w+个) 都可以一键安装,例如:
|
||||
|
||||
|
||||
```text
|
||||
/skill install clawhub:<name>
|
||||
```
|
||||
|
||||
## 从 LinkAI 安装
|
||||
|
||||
[LinkAI](https://link-ai.tech/console) 上的所有公开资源 (1w+个插件/应用/工作流) ,以及自己创建的资源 (应用/工作流/知识库/数据库/插件) 都可以通过命令一键安装:
|
||||
|
||||
```text
|
||||
/skill install linkai:<code>
|
||||
```
|
||||
|
||||
> LinkAI平台上创建的所有应用、工作流、知识库、数据库、插件都有唯一的code,可在[控制台](https://link-ai.tech/console)各资源页面中进行获取并填写到命令中
|
||||
|
||||
## 从 URL 安装
|
||||
|
||||
支持 zip 压缩包和 SKILL.md 文件链接:
|
||||
|
||||
```text
|
||||
/skill install https://cdn.link-ai.tech/skills/pptx.zip
|
||||
/skill install https://example.com/path/to/SKILL.md
|
||||
```
|
||||
|
||||
## 管理技能
|
||||
|
||||
```text
|
||||
/skill list # 查看已安装技能
|
||||
/skill info pptx # 查看技能详情
|
||||
/skill enable pptx # 启用技能
|
||||
/skill disable pptx # 禁用技能
|
||||
/skill uninstall pptx # 卸载技能
|
||||
```
|
||||
|
||||
<Tip>
|
||||
以上所有命令在终端中使用时,将 `/skill` 替换为 `cow skill` 即可。完整命令说明参考 [技能管理命令](/commands/skill)。
|
||||
</Tip>
|
||||
@@ -1,47 +0,0 @@
|
||||
---
|
||||
title: LinkAI 智能体
|
||||
description: 对接 LinkAI 平台的多智能体技能
|
||||
---
|
||||
|
||||
将 [LinkAI](https://link-ai.tech/) 平台上的智能体作为 Skill 使用,实现多智能体决策。Agent 根据智能体的名称和描述智能选择,通过 `app_code` 调用对应的应用或工作流。
|
||||
|
||||
## 依赖
|
||||
|
||||
| 依赖 | 说明 |
|
||||
| --- | --- |
|
||||
| `LINKAI_API_KEY` | LinkAI 平台 API 密钥,在 [控制台](https://link-ai.tech/console/interface) 创建 |
|
||||
| `curl` | 系统命令(通常已预装) |
|
||||
|
||||
配置方式:
|
||||
|
||||
- 通过 `env_config` 工具配置 `LINKAI_API_KEY`
|
||||
- 或在 `config.json` 中填写 `linkai_api_key`
|
||||
|
||||
## 配置智能体
|
||||
|
||||
在 `skills/linkai-agent/config.json` 中添加可用的智能体:
|
||||
|
||||
```json
|
||||
{
|
||||
"apps": [
|
||||
{
|
||||
"app_code": "G7z6vKwp",
|
||||
"app_name": "LinkAI客服助手",
|
||||
"app_description": "当用户需要了解LinkAI平台相关问题时才选择该助手"
|
||||
},
|
||||
{
|
||||
"app_code": "SFY5x7JR",
|
||||
"app_name": "内容创作助手",
|
||||
"app_description": "当用户需要创作图片或视频时才使用该助手"
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
## 使用方式
|
||||
|
||||
配置完成后,Agent 会根据用户的问题自动选择合适的 LinkAI 智能体进行回答。
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
|
||||
</Frame>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: 创建技能
|
||||
description: 通过对话创建自定义技能
|
||||
---
|
||||
|
||||
通过自然语言对话快速创建、安装或更新技能。
|
||||
|
||||
## 依赖
|
||||
|
||||
无额外依赖,始终可用。
|
||||
|
||||
## 使用方式
|
||||
|
||||
- 将工作流程固化为技能:"帮我把这个部署流程创建为一个技能"
|
||||
- 对接第三方 API:"根据这个接口文档创建一个技能"
|
||||
- 安装远程技能:"帮我安装 xxx 技能"
|
||||
|
||||
## 创建流程
|
||||
|
||||
1. 告诉 Agent 你想创建的技能功能
|
||||
2. Agent 自动生成 `SKILL.md` 说明文件和运行脚本
|
||||
3. 技能保存到工作空间的 `~/cow/skills/` 目录
|
||||
4. 后续对话中 Agent 会自动识别并使用该技能
|
||||
|
||||
<Frame>
|
||||
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
|
||||
</Frame>
|
||||
|
||||
<Tip>
|
||||
详细开发文档可参考 [Skill 创造器说明](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md)。
|
||||
</Tip>
|
||||
@@ -1,31 +0,0 @@
|
||||
---
|
||||
title: 网页抓取
|
||||
description: 抓取网页文本内容
|
||||
---
|
||||
|
||||
使用 curl 抓取网页并提取可读文本内容,轻量级的网页访问方式,无需浏览器自动化。
|
||||
|
||||
## 依赖
|
||||
|
||||
| 依赖 | 说明 |
|
||||
| --- | --- |
|
||||
| `curl` | 系统命令(通常已预装) |
|
||||
|
||||
该技能设置了 `always: true`,只要系统有 `curl` 命令即默认启用。
|
||||
|
||||
## 使用方式
|
||||
|
||||
当 Agent 需要获取某个 URL 的网页内容时会自动调用,无需额外配置。
|
||||
|
||||
## 与 browser 工具的区别
|
||||
|
||||
| 特性 | web-fetch(技能) | browser(工具) |
|
||||
| --- | --- | --- |
|
||||
| 依赖 | 仅 curl | browser-use + playwright |
|
||||
| JS 渲染 | 不支持 | 支持 |
|
||||
| 页面交互 | 不支持 | 支持点击、输入等 |
|
||||
| 适用场景 | 获取静态页面文本 | 操作动态网页 |
|
||||
|
||||
<Tip>
|
||||
对于大多数网页内容获取场景,web-fetch 就够用了。只有需要 JS 渲染或页面交互时才需要 browser 工具。
|
||||
</Tip>
|
||||
@@ -1,25 +1,109 @@
|
||||
---
|
||||
title: browser - 浏览器
|
||||
description: 访问和操作网页
|
||||
description: 控制浏览器访问和操作网页
|
||||
---
|
||||
|
||||
使用浏览器访问和操作网页,支持 JavaScript 渲染的动态页面。
|
||||
控制 Chromium 浏览器进行网页导航、元素交互和内容提取。支持 JavaScript 渲染的动态页面,使用精简 DOM 快照让 Agent 高效理解页面结构。
|
||||
|
||||
## 依赖
|
||||
## 安装
|
||||
|
||||
| 依赖 | 安装命令 |
|
||||
| --- | --- |
|
||||
| `browser-use` ≥ 0.1.40 | `pip install browser-use` |
|
||||
| `markdownify` | `pip install markdownify` |
|
||||
| `playwright` + chromium | `pip install playwright && playwright install chromium` |
|
||||
<Tabs>
|
||||
<Tab title="CLI 安装(推荐)">
|
||||
```bash
|
||||
cow install-browser
|
||||
```
|
||||
|
||||
该命令会自动完成:
|
||||
- 安装 `playwright` Python 包(旧系统自动降级兼容版本)
|
||||
- 在 Linux 上安装系统依赖
|
||||
- 下载 Chromium 浏览器(Linux 服务器自动使用无头精简版)
|
||||
- 自动检测国内网络并使用镜像加速
|
||||
</Tab>
|
||||
<Tab title="手动安装">
|
||||
```bash
|
||||
pip install playwright
|
||||
playwright install chromium
|
||||
```
|
||||
|
||||
Linux 服务器还需安装系统依赖:
|
||||
```bash
|
||||
sudo playwright install-deps chromium
|
||||
```
|
||||
|
||||
如果系统较旧(如 Ubuntu 18.04,glibc < 2.28),需安装兼容版本:
|
||||
```bash
|
||||
pip install playwright==1.28.0
|
||||
python -m playwright install chromium
|
||||
```
|
||||
|
||||
国内网络下载 Chromium 较慢,可设置镜像加速:
|
||||
```bash
|
||||
export PLAYWRIGHT_DOWNLOAD_HOST=https://registry.npmmirror.com/-/binary/playwright
|
||||
python -m playwright install chromium
|
||||
```
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
<Note>
|
||||
支持 Ubuntu 20.04+、Debian 10+、macOS、Windows。Ubuntu 18.04 等旧系统会自动降级安装兼容版本。
|
||||
</Note>
|
||||
|
||||
## 工作流程
|
||||
|
||||
Agent 使用浏览器的典型流程:
|
||||
|
||||
1. **`navigate`** — 打开目标 URL
|
||||
2. **`snapshot`** — 获取页面精简 DOM,交互元素自动编号(ref)
|
||||
3. **`click` / `fill` / `select`** — 通过 ref 编号操作元素
|
||||
4. **`snapshot`** — 再次快照验证操作结果
|
||||
|
||||
## 支持的操作
|
||||
|
||||
| 操作 | 说明 | 关键参数 |
|
||||
| --- | --- | --- |
|
||||
| `navigate` | 打开 URL | `url` |
|
||||
| `snapshot` | 获取页面结构化文本(主要方式) | `selector`(可选) |
|
||||
| `click` | 点击元素 | `ref` 或 `selector` |
|
||||
| `fill` | 填入文本 | `ref` 或 `selector`,`text` |
|
||||
| `select` | 下拉选择 | `ref` 或 `selector`,`value` |
|
||||
| `scroll` | 滚动页面 | `direction`(up/down/left/right) |
|
||||
| `screenshot` | 截图保存到工作区 | `full_page` |
|
||||
| `wait` | 等待元素或超时 | `selector`,`timeout` |
|
||||
| `press` | 按键(Enter、Tab 等) | `key` |
|
||||
| `back` / `forward` | 浏览器前进/后退 | - |
|
||||
| `get_text` | 获取元素文本内容 | `selector` |
|
||||
| `evaluate` | 执行 JavaScript | `script` |
|
||||
|
||||
## 使用场景
|
||||
|
||||
- 访问指定 URL 获取页面内容
|
||||
- 操作网页元素(点击、输入等)
|
||||
- 访问指定 URL 获取动态页面内容
|
||||
- 填写表单、登录操作
|
||||
- 操作网页元素(点击按钮、选择选项等)
|
||||
- 验证部署后的网页效果
|
||||
- 抓取需要 JS 渲染的动态内容
|
||||
|
||||
## 运行模式
|
||||
|
||||
浏览器会根据运行环境自动选择模式:
|
||||
|
||||
| 环境 | 模式 |
|
||||
| --- | --- |
|
||||
| macOS / Windows | 有头模式(显示浏览器窗口) |
|
||||
| Linux 桌面(有 DISPLAY) | 有头模式 |
|
||||
| Linux 服务器(无 DISPLAY) | 无头模式(headless) |
|
||||
|
||||
可在 `config.json` 中手动覆盖:
|
||||
|
||||
```json
|
||||
{
|
||||
"tools": {
|
||||
"browser": {
|
||||
"headless": true
|
||||
}
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
<Note>
|
||||
浏览器工具依赖较重,如不需要可不安装。轻量的网页内容获取可使用 `web-fetch` 技能。
|
||||
浏览器工具依赖较重(~300MB),如不需要可不安装。轻量的网页内容获取可使用 `web_fetch` 工具。
|
||||
</Note>
|
||||
|
||||
@@ -31,6 +31,15 @@ description: CowAgent 内置工具系统
|
||||
<Card title="memory - 记忆" icon="brain" href="/tools/memory">
|
||||
搜索和读取长期记忆
|
||||
</Card>
|
||||
<Card title="env_config - 环境变量" icon="key" href="/tools/env-config">
|
||||
管理 API Key 等秘钥配置
|
||||
</Card>
|
||||
<Card title="web_fetch - 网页获取" icon="globe" href="/tools/web-fetch">
|
||||
获取网页或文档内容
|
||||
</Card>
|
||||
<Card title="scheduler - 定时任务" icon="clock" href="/tools/scheduler">
|
||||
创建和管理定时任务
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
## 可选工具
|
||||
@@ -38,13 +47,13 @@ description: CowAgent 内置工具系统
|
||||
以下工具需要安装额外依赖或配置 API Key 后启用:
|
||||
|
||||
<CardGroup cols={2}>
|
||||
<Card title="env_config - 环境变量" icon="key" href="/tools/env-config">
|
||||
管理 API Key 等秘钥配置
|
||||
</Card>
|
||||
<Card title="scheduler - 定时任务" icon="clock" href="/tools/scheduler">
|
||||
创建和管理定时任务
|
||||
</Card>
|
||||
<Card title="web_search - 联网搜索" icon="magnifying-glass" href="/tools/web-search">
|
||||
搜索互联网获取实时信息
|
||||
</Card>
|
||||
<Card title="vision - 图片分析" icon="eye" href="/tools/vision">
|
||||
分析图片内容(识别、描述、OCR 文字提取等)
|
||||
</Card>
|
||||
<Card title="browser - 浏览器" icon="window" href="/tools/browser">
|
||||
控制浏览器访问和操作网页
|
||||
</Card>
|
||||
</CardGroup>
|
||||
|
||||
36
docs/tools/vision.mdx
Normal file
36
docs/tools/vision.mdx
Normal file
@@ -0,0 +1,36 @@
|
||||
---
|
||||
title: vision - 图片分析
|
||||
description: 分析图片内容(识别、描述、OCR 等)
|
||||
---
|
||||
|
||||
使用 Vision API 分析本地图片或图片 URL,支持内容描述、文字提取(OCR)、物体识别等。
|
||||
|
||||
## 依赖
|
||||
|
||||
需要配置至少一个 API Key(通过 `env_config` 工具或工作空间 `.env` 文件配置):
|
||||
|
||||
| 后端 | 环境变量 | 优先级 |
|
||||
| --- | --- | --- |
|
||||
| OpenAI | `OPENAI_API_KEY` | 优先使用 |
|
||||
| LinkAI | `LINKAI_API_KEY` | 备选 |
|
||||
|
||||
## 参数
|
||||
|
||||
| 参数 | 类型 | 必填 | 说明 |
|
||||
| --- | --- | --- | --- |
|
||||
| `image` | string | 是 | 本地文件路径或 HTTP(S) 图片 URL |
|
||||
| `question` | string | 是 | 对图片提出的问题 |
|
||||
| `model` | string | 否 | 模型名称(默认 gpt-4.1-mini) |
|
||||
|
||||
支持的图片格式:jpg、jpeg、png、gif、webp
|
||||
|
||||
## 使用场景
|
||||
|
||||
- 描述图片中的内容
|
||||
- 提取图片中的文字(OCR)
|
||||
- 识别物体、颜色、场景
|
||||
- 分析截图、文档扫描件
|
||||
|
||||
<Note>
|
||||
超过 1MB 的图片会自动压缩后上传。如果未配置任何 Vision API Key,该工具不会被加载。
|
||||
</Note>
|
||||
Some files were not shown because too many files have changed in this diff Show More
Reference in New Issue
Block a user