Compare commits

17 Commits

Author SHA1 Message Date
zhayujie
6db22827f2 feat: docs update 2026-02-27 16:03:47 +08:00
zhayujie
d891312032 docs: init docs 2026-02-27 12:10:16 +08:00
zhayujie
3ddbdd713d Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-26 18:57:43 +08:00
zhayujie
9ba107b511 Merge branch 'feat-multi-channel' 2026-02-26 18:57:19 +08:00
zhayujie
c9adddb76a fix: pass channel_type correctly in multi-channel mode 2026-02-26 18:57:08 +08:00
zhayujie
f0a12d5ff5 Merge pull request #2678 from zhayujie/feat-multi-channel
feat: support multi-channel
2026-02-26 18:34:48 +08:00
zhayujie
7cce224499 feat: support multi-channel 2026-02-26 18:34:08 +08:00
zhayujie
97397ca585 Merge pull request #2674 from haosenwang1018/fix/bare-excepts
fix: replace 29 bare except clauses with except Exception
2026-02-26 12:11:49 +08:00
zhayujie
f2fbc602a8 Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-26 10:45:01 +08:00
zhayujie
925d728a86 fix: replace upsert syntax to support SQLite lower version 2026-02-26 10:44:04 +08:00
zhayujie
f5f229871b Merge pull request #2676 from zhayujie/feat-multi-channel
feat: improve web console and conversation store
2026-02-26 10:37:03 +08:00
zhayujie
9917552b4b fix: improve web UI stability and conversation history restore
- Fix dark mode FOUC: apply theme in <head> before first paint, defer
  transition-colors to post-init to avoid animated flash on load
- Fix Safari IME Enter bug: defer compositionend reset via setTimeout(0)
- Fix history scroll: use requestAnimationFrame before scrollChatToBottom
- Limit restore turns to min(6, max_turns//3) on restart
- Fix load_messages cutoff to start at turn boundary, preventing orphaned
  tool_use/tool_result pairs from being sent to the LLM
- Merge all assistant messages within one user turn into a single bubble;
  render tool_calls in history using same CSS as live SSE view
- Handle empty choices list in stream chunks
2026-02-26 10:35:20 +08:00
haosenwang1018
adca89b973 fix: replace bare except clauses with except Exception
Bare `except:` catches BaseException including KeyboardInterrupt and
SystemExit. Replaced 29 instances with `except Exception:`.
2026-02-25 11:49:19 +00:00
zhayujie
29bfbecdc9 feat: persistent storage of conversation history 2026-02-25 18:01:39 +08:00
zhayujie
1a7a8c98d9 docs: add scam warning disclaimer 2026-02-25 01:34:16 +08:00
zhayujie
cddb38ac3d Merge pull request #2673 from zhayujie/feat-web-console
feat: web console
2026-02-24 00:06:29 +08:00
zhayujie
d610608391 feat: add cloud host config 2026-02-23 15:06:31 +08:00
125 changed files with 4945 additions and 343 deletions
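
The message of commit adca89b973 states that a bare `except:` also catches `KeyboardInterrupt` and `SystemExit`. A minimal sketch of that difference follows; the helper names (`swallow_with_bare_except`, `swallow_with_except_exception`, `propagates`) are hypothetical and exist only for illustration, they are not part of the repository.

```python
def swallow_with_bare_except(exc: BaseException) -> str:
    """Bare except: catches every BaseException, including interrupts."""
    try:
        raise exc
    except:  # noqa: E722 - intentionally bare to show the old behaviour
        return "swallowed"


def swallow_with_except_exception(exc: BaseException) -> str:
    """except Exception: ordinary errors are caught, interrupts propagate."""
    try:
        raise exc
    except Exception:
        return "swallowed"


def propagates(func, exc: BaseException) -> bool:
    """Return True when func lets exc escape instead of swallowing it."""
    try:
        func(exc)
    except BaseException:
        return True
    return False
```

With `except Exception:`, Ctrl+C and `sys.exit()` are no longer silently absorbed by cleanup code, which is the behaviour the commit message describes.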

View File

@@ -24,8 +24,9 @@
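## Disclaimer
1. This project is released under the [MIT License](/LICENSE) and is intended mainly for technical research and learning. When using this project you must comply with local laws and regulations, relevant policies, and corporate bylaws; any use that is illegal or infringes on the rights of others is prohibited. Any individual, team, or enterprise, regardless of how they use this project or to whom they provide services, bears all resulting consequences; this project assumes no liability
2. Cost and security: in Agent mode, token usage is higher than in normal chat mode, so choose a model by weighing effectiveness against cost. The Agent can access the operating system it runs on, so choose the deployment environment carefully. The project will keep upgrading its security mechanisms and reducing model consumption costs
1. This project is released under the [MIT License](/LICENSE) and is intended mainly for technical research and learning. When using this project you must comply with local laws and regulations, relevant policies, and corporate bylaws; any use that is illegal or infringes on the rights of others is prohibited. Any individual, team, or enterprise, regardless of how they use this project or to whom they provide services, bears all resulting consequences; this project assumes no liability
2. Cost and security: in Agent mode, token usage is higher than in normal chat mode, so choose a model by weighing effectiveness against cost. The Agent can access the operating system it runs on, so choose the deployment environment carefully. The project will keep upgrading its security mechanisms and reducing model consumption costs
3. The CowAgent project focuses on open-source technology development and will not participate in, authorize, or issue any cryptocurrency.
## Demo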
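@@ -607,10 +608,12 @@ Create the API key in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn)
The following describes how to configure each supported channel; the channel code lives in the project's `channel/` directory.
Multiple channels can be enabled at the same time; separate them with commas in the configuration, for example `"channel_type": "feishu,dingtalk"`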
<details>
<summary>1. Web</summary>
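After the project starts, the Web channel runs by default. Configuration:
After the project starts, the Web console runs by default. Configuration: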
```json
{

View File

@@ -158,6 +158,7 @@ class ChatService:
logger.info(f"[ChatService] Agent run completed: session={session_id}")
class _StreamState:
"""Mutable state shared between the event callback and the run method."""

View File

@@ -1,11 +1,21 @@
"""
Memory module for AgentMesh
Provides long-term memory capabilities with hybrid search (vector + keyword)
Provides both long-term memory (vector/keyword search) and short-term
conversation history persistence (SQLite).
"""
from agent.memory.manager import MemoryManager
from agent.memory.config import MemoryConfig, get_default_memory_config, set_global_memory_config
from agent.memory.embedding import create_embedding_provider
from agent.memory.conversation_store import ConversationStore, get_conversation_store
__all__ = ['MemoryManager', 'MemoryConfig', 'get_default_memory_config', 'set_global_memory_config', 'create_embedding_provider']
__all__ = [
'MemoryManager',
'MemoryConfig',
'get_default_memory_config',
'set_global_memory_config',
'create_embedding_provider',
'ConversationStore',
'get_conversation_store',
]

View File

@@ -0,0 +1,618 @@
"""
Conversation history persistence using SQLite.
Design:
- sessions table: per-session metadata (channel_type, last_active, msg_count)
- messages table: individual messages stored as JSON, append-only
- Pruning: age-based only (sessions not updated within N days are deleted)
- Thread-safe via a single in-process lock
Storage path: shared with the long-term memory database, by default
~/cow/memory/long-term/index.db (see get_conversation_store)
"""
from __future__ import annotations
import json
import sqlite3
import threading
import time
from pathlib import Path
from typing import Any, Dict, List, Optional
from common.log import logger
# ---------------------------------------------------------------------------
# Schema
# ---------------------------------------------------------------------------
_DDL = """
CREATE TABLE IF NOT EXISTS sessions (
session_id TEXT PRIMARY KEY,
channel_type TEXT NOT NULL DEFAULT '',
created_at INTEGER NOT NULL,
last_active INTEGER NOT NULL,
msg_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS messages (
id INTEGER PRIMARY KEY AUTOINCREMENT,
session_id TEXT NOT NULL,
seq INTEGER NOT NULL,
role TEXT NOT NULL,
content TEXT NOT NULL,
created_at INTEGER NOT NULL,
UNIQUE (session_id, seq)
);
CREATE INDEX IF NOT EXISTS idx_messages_session
ON messages (session_id, seq);
CREATE INDEX IF NOT EXISTS idx_sessions_last_active
ON sessions (last_active);
"""
# Migration: add channel_type column to existing databases that predate it.
_MIGRATION_ADD_CHANNEL_TYPE = """
ALTER TABLE sessions ADD COLUMN channel_type TEXT NOT NULL DEFAULT '';
"""
DEFAULT_MAX_AGE_DAYS: int = 30
def _is_visible_user_message(content: Any) -> bool:
"""
Return True when a user-role message represents actual user input
(not an internal tool_result injected by the agent loop).
"""
if isinstance(content, str):
return bool(content.strip())
if isinstance(content, list):
return any(
isinstance(b, dict) and b.get("type") == "text"
for b in content
)
return False
def _extract_display_text(content: Any) -> str:
"""
Extract the human-readable text portion from a message content value.
Returns an empty string for tool_use / tool_result blocks.
"""
if isinstance(content, str):
return content.strip()
if isinstance(content, list):
parts = [
b.get("text", "")
for b in content
if isinstance(b, dict) and b.get("type") == "text"
]
return "\n".join(p for p in parts if p).strip()
return ""
def _extract_tool_calls(content: Any) -> List[Dict[str, Any]]:
"""
Extract tool_use blocks from an assistant message content.
Returns a list of {id, name, arguments} dicts (result filled in later).
"""
if not isinstance(content, list):
return []
return [
{"id": b.get("id", ""), "name": b.get("name", ""), "arguments": b.get("input", {})}
for b in content
if isinstance(b, dict) and b.get("type") == "tool_use"
]
def _extract_tool_results(content: Any) -> Dict[str, str]:
"""
Extract tool_result blocks from a user message, keyed by tool_use_id.
"""
if not isinstance(content, list):
return {}
results = {}
for b in content:
if not isinstance(b, dict) or b.get("type") != "tool_result":
continue
tool_id = b.get("tool_use_id", "")
result_content = b.get("content", "")
if isinstance(result_content, list):
result_content = "\n".join(
rb.get("text", "") for rb in result_content
if isinstance(rb, dict) and rb.get("type") == "text"
)
results[tool_id] = str(result_content)
return results
def _group_into_display_turns(
rows: List[tuple],
) -> List[Dict[str, Any]]:
"""
Convert raw (role, content_json, created_at) DB rows into display turns.
One display turn = one visible user message + one merged assistant reply.
All intermediate assistant messages (those carrying tool_use) and the final
assistant text reply produced for the same user query are collapsed into a
single assistant turn, exactly matching the live SSE rendering where tools
and the final answer appear inside the same bubble.
Grouping rules:
- A visible user message starts a new group.
- tool_result user messages are internal; their content is attached to the
matching tool_use entry via tool_use_id and they never become own turns.
- All assistant messages within a group are merged:
* tool_use blocks → tool_calls list (result filled from tool_results)
* text blocks → last non-empty text becomes the display content
"""
# ------------------------------------------------------------------ #
# Pass 1: split rows into groups, each starting with a visible user msg
# ------------------------------------------------------------------ #
# group = (user_row | None, [subsequent_rows])
# user_row: (content, created_at)
groups: List[tuple] = []
cur_user: Optional[tuple] = None
cur_rest: List[tuple] = []
started = False
for role, raw_content, created_at in rows:
try:
content = json.loads(raw_content)
except Exception:
content = raw_content
if role == "user" and _is_visible_user_message(content):
if started:
groups.append((cur_user, cur_rest))
cur_user = (content, created_at)
cur_rest = []
started = True
else:
cur_rest.append((role, content, created_at))
if started:
groups.append((cur_user, cur_rest))
# ------------------------------------------------------------------ #
# Pass 2: build display turns from each group
# ------------------------------------------------------------------ #
turns: List[Dict[str, Any]] = []
for user_row, rest in groups:
# User turn
if user_row:
content, created_at = user_row
text = _extract_display_text(content)
if text:
turns.append({"role": "user", "content": text, "created_at": created_at})
# Collect all tool_calls and tool_results from the rest of the group
all_tool_calls: List[Dict[str, Any]] = []
tool_results: Dict[str, str] = {}
final_text = ""
final_ts: Optional[int] = None
for role, content, created_at in rest:
if role == "user":
tool_results.update(_extract_tool_results(content))
elif role == "assistant":
tcs = _extract_tool_calls(content)
all_tool_calls.extend(tcs)
t = _extract_display_text(content)
if t:
final_text = t
final_ts = created_at
# Attach tool results to their matching tool_call entries
for tc in all_tool_calls:
tc["result"] = tool_results.get(tc.get("id", ""), "")
if final_text or all_tool_calls:
turns.append({
"role": "assistant",
"content": final_text,
"tool_calls": all_tool_calls,
"created_at": final_ts or (user_row[1] if user_row else 0),
})
return turns
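
The pairing of `tool_use` blocks with their `tool_result` blocks described in the docstring above can be sketched in isolation. This is a simplified illustration, not the repository's implementation: `merge_tool_results` is a hypothetical name, and the sketch assumes each tool result carries plain string content.

```python
from typing import Any, Dict, List


def merge_tool_results(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Pair every assistant tool_use block with the tool_result sharing its id."""
    calls: List[Dict[str, Any]] = []
    results: Dict[str, str] = {}
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            continue  # plain-text messages carry no tool blocks
        for block in content:
            if not isinstance(block, dict):
                continue
            if msg.get("role") == "assistant" and block.get("type") == "tool_use":
                calls.append({
                    "id": block.get("id", ""),
                    "name": block.get("name", ""),
                    "arguments": block.get("input", {}),
                })
            elif msg.get("role") == "user" and block.get("type") == "tool_result":
                results[block.get("tool_use_id", "")] = str(block.get("content", ""))
    # Attach results after the scan so message order does not matter.
    for call in calls:
        call["result"] = results.get(call["id"], "")
    return calls


history = [
    {"role": "user", "content": "list files"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "id": "t1", "name": "ls", "input": {"path": "."}},
    ]},
    {"role": "user", "content": [
        {"type": "tool_result", "tool_use_id": "t1", "content": "a.txt"},
    ]},
    {"role": "assistant", "content": [{"type": "text", "text": "Found a.txt"}]},
]
calls = merge_tool_results(history)
```

Here `calls` holds a single entry for `ls` whose `result` is the text of the matching `tool_result`, which is the shape the history view renders inside one assistant bubble.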
class ConversationStore:
"""
SQLite-backed store for per-session conversation history.
Usage:
store = ConversationStore(db_path)
store.append_messages("user_123", new_messages, channel_type="feishu")
msgs = store.load_messages("user_123", max_turns=30)
"""
def __init__(self, db_path: Path):
self._db_path = db_path
self._lock = threading.Lock()
self._init_db()
# ------------------------------------------------------------------
# Public API
# ------------------------------------------------------------------
def load_messages(
self,
session_id: str,
max_turns: int = 30,
) -> List[Dict[str, Any]]:
"""
Load the most recent messages for a session, for injection into the LLM.
ALL message types (user text, assistant tool_use, tool_result) are returned
in their original JSON form so the LLM can reconstruct the full context.
max_turns is a *visible-turn* count: we count only user messages whose
content is actual user text (not tool_result blocks). This prevents
tool-heavy sessions from exhausting the turn budget prematurely.
Args:
session_id: Unique session identifier.
max_turns: Maximum number of visible user-assistant turns to keep.
Returns:
Chronologically ordered list of message dicts (role, content).
"""
with self._lock:
conn = self._connect()
try:
rows = conn.execute(
"""
SELECT seq, role, content
FROM messages
WHERE session_id = ?
ORDER BY seq DESC
""",
(session_id,),
).fetchall()
finally:
conn.close()
if not rows:
return []
# Walk newest-to-oldest counting *visible* user turns (actual user text,
# not tool_result injections). Record the seq of every visible user
# message so we can find a clean cut point later.
visible_turn_seqs: List[int] = [] # newest first
for seq, role, raw_content in rows:
if role != "user":
continue
try:
content = json.loads(raw_content)
except Exception:
content = raw_content
if _is_visible_user_message(content):
visible_turn_seqs.append(seq)
# Determine the seq of the oldest visible user message we want to keep.
# If the total turns fit within max_turns, keep everything.
if len(visible_turn_seqs) <= max_turns:
cutoff_seq = None # keep all
else:
# visible_turn_seqs is newest-first, so index max_turns - 1 holds the
# oldest visible user message we keep.
cutoff_seq = visible_turn_seqs[max_turns - 1]
# Build result in chronological order, starting from cutoff.
# IMPORTANT: we start exactly at cutoff_seq (the visible user message),
# never mid-group, so tool_use / tool_result pairs are always complete.
result = []
for seq, role, raw_content in reversed(rows):
if cutoff_seq is not None and seq < cutoff_seq:
continue
try:
content = json.loads(raw_content)
except Exception:
content = raw_content
result.append({"role": role, "content": content})
return result
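
The turn-boundary cutoff used by `load_messages` can be illustrated on a plain list, without SQLite. This is a sketch under assumptions: `is_visible_user` and `keep_last_turns` are hypothetical names, and list indices stand in for the `seq` column.

```python
from typing import Any, Dict, List


def is_visible_user(msg: Dict[str, Any]) -> bool:
    """True for user messages holding real user text (not tool_result blocks)."""
    if msg.get("role") != "user":
        return False
    content = msg.get("content")
    if isinstance(content, str):
        return bool(content.strip())
    if isinstance(content, list):
        return any(isinstance(b, dict) and b.get("type") == "text" for b in content)
    return False


def keep_last_turns(messages: List[Dict[str, Any]], max_turns: int) -> List[Dict[str, Any]]:
    """Keep the newest max_turns turns, always cutting at a visible user message."""
    if max_turns < 1:
        return []
    starts = [i for i, m in enumerate(messages) if is_visible_user(m)]
    if len(starts) <= max_turns:
        return list(messages)  # everything fits, keep all
    # Start exactly at a visible user message so no tool pair is split.
    return messages[starts[-max_turns]:]


history = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": [{"type": "tool_use", "id": "t1", "name": "ls", "input": {}}]},
    {"role": "user", "content": [{"type": "tool_result", "tool_use_id": "t1", "content": "ok"}]},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
]
latest = keep_last_turns(history, 1)
```

Because the cut always lands on a visible user message, the `tool_use` and `tool_result` messages of the first turn are either kept together or dropped together.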
def append_messages(
self,
session_id: str,
messages: List[Dict[str, Any]],
channel_type: str = "",
) -> None:
"""
Append new messages to a session's history.
Seq numbers continue from the session's current maximum, so
concurrent callers on distinct sessions never collide.
Args:
session_id: Unique session identifier.
messages: List of message dicts to append.
channel_type: Source channel (e.g. "feishu", "web", "wechat").
Only written on session creation; ignored on update.
"""
if not messages:
return
now = int(time.time())
with self._lock:
conn = self._connect()
try:
with conn:
# INSERT OR IGNORE creates the row on first visit;
# the UPDATE always refreshes last_active.
# Avoids ON CONFLICT...DO UPDATE (requires SQLite >= 3.24).
conn.execute(
"""
INSERT OR IGNORE INTO sessions
(session_id, channel_type, created_at, last_active, msg_count)
VALUES (?, ?, ?, ?, 0)
""",
(session_id, channel_type, now, now),
)
conn.execute(
"UPDATE sessions SET last_active = ? WHERE session_id = ?",
(now, session_id),
)
# Determine starting seq for the new batch.
row = conn.execute(
"SELECT COALESCE(MAX(seq), -1) FROM messages WHERE session_id = ?",
(session_id,),
).fetchone()
next_seq = row[0] + 1
for msg in messages:
role = msg.get("role", "")
content = json.dumps(
msg.get("content", ""), ensure_ascii=False
)
conn.execute(
"""
INSERT OR IGNORE INTO messages
(session_id, seq, role, content, created_at)
VALUES (?, ?, ?, ?, ?)
""",
(session_id, next_seq, role, content, now),
)
next_seq += 1
conn.execute(
"""
UPDATE sessions
SET msg_count = (
SELECT COUNT(*) FROM messages WHERE session_id = ?
)
WHERE session_id = ?
""",
(session_id, session_id),
)
finally:
conn.close()
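
The `INSERT OR IGNORE` plus `UPDATE` pair used above in place of `ON CONFLICT ... DO UPDATE` can be tried against an in-memory database. A minimal sketch, assuming a reduced `sessions` table and a hypothetical helper name `touch_session`:

```python
import sqlite3
import time
from typing import Optional


def touch_session(conn: sqlite3.Connection, session_id: str,
                  channel_type: str = "", now: Optional[int] = None) -> None:
    """Create the session row on first visit, then always refresh last_active."""
    now = int(time.time()) if now is None else now
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            "INSERT OR IGNORE INTO sessions "
            "(session_id, channel_type, created_at, last_active) "
            "VALUES (?, ?, ?, ?)",
            (session_id, channel_type, now, now),
        )
        conn.execute(
            "UPDATE sessions SET last_active = ? WHERE session_id = ?",
            (now, session_id),
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sessions ("
    "session_id TEXT PRIMARY KEY, "
    "channel_type TEXT NOT NULL DEFAULT '', "
    "created_at INTEGER NOT NULL, "
    "last_active INTEGER NOT NULL)"
)
touch_session(conn, "user_123", "feishu", now=100)
touch_session(conn, "user_123", "web", now=200)  # second visit: row already exists
row = conn.execute(
    "SELECT channel_type, created_at, last_active FROM sessions"
).fetchone()
```

The second call leaves `channel_type` and `created_at` untouched and only moves `last_active`, matching the docstring's note that `channel_type` is written on session creation only.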
def clear_session(self, session_id: str) -> None:
"""Delete all messages and the session record for a given session_id."""
with self._lock:
conn = self._connect()
try:
with conn:
conn.execute(
"DELETE FROM messages WHERE session_id = ?", (session_id,)
)
conn.execute(
"DELETE FROM sessions WHERE session_id = ?", (session_id,)
)
finally:
conn.close()
def cleanup_old_sessions(self, max_age_days: Optional[int] = None) -> int:
"""
Delete sessions that have not been active within max_age_days.
Args:
max_age_days: Override the default retention period.
Returns:
Number of sessions deleted.
"""
try:
from config import conf
max_age = max_age_days or conf().get(
"conversation_max_age_days", DEFAULT_MAX_AGE_DAYS
)
except Exception:
max_age = max_age_days or DEFAULT_MAX_AGE_DAYS
cutoff = int(time.time()) - max_age * 86400
deleted = 0
with self._lock:
conn = self._connect()
try:
with conn:
stale = conn.execute(
"SELECT session_id FROM sessions WHERE last_active < ?",
(cutoff,),
).fetchall()
for (sid,) in stale:
conn.execute(
"DELETE FROM messages WHERE session_id = ?", (sid,)
)
conn.execute(
"DELETE FROM sessions WHERE session_id = ?", (sid,)
)
deleted += 1
finally:
conn.close()
if deleted:
logger.info(f"[ConversationStore] Pruned {deleted} expired sessions")
return deleted
def load_history_page(
self,
session_id: str,
page: int = 1,
page_size: int = 20,
) -> Dict[str, Any]:
"""
Load a page of conversation history for UI display, grouped into turns.
Each "turn" maps to one of:
- A user message (role="user", content=str)
- An assistant message (role="assistant", content=str,
tool_calls=[{name, arguments, result}] when tools were used)
Internal tool_result user messages are merged into the preceding
assistant entry's tool_calls list and never appear as standalone items.
Pages are numbered from 1 (most recent). Messages within a page are
returned in chronological order.
Returns:
{
"messages": [
{
"role": "user" | "assistant",
"content": str,
"tool_calls": [...], # assistant only, may be []
"created_at": int,
},
...
],
"total": <visible turn count>,
"page": <current page>,
"page_size": <page_size>,
"has_more": bool,
}
"""
page = max(1, page)
with self._lock:
conn = self._connect()
try:
rows = conn.execute(
"""
SELECT role, content, created_at
FROM messages
WHERE session_id = ?
ORDER BY seq ASC
""",
(session_id,),
).fetchall()
finally:
conn.close()
visible = _group_into_display_turns(rows)
total = len(visible)
offset = (page - 1) * page_size
page_items = list(reversed(visible))[offset: offset + page_size]
page_items = list(reversed(page_items))
return {
"messages": page_items,
"total": total,
"page": page,
"page_size": page_size,
"has_more": offset + page_size < total,
}
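
The newest-first paging in `load_history_page` reverses the turn list twice. The same window can be computed with slice arithmetic, shown here as an alternative sketch; `paginate_newest_first` is a hypothetical name and is not part of the commit.

```python
from typing import Any, Dict, List


def paginate_newest_first(turns: List[Any], page: int = 1,
                          page_size: int = 20) -> Dict[str, Any]:
    """Page 1 is the newest page; items inside a page stay chronological."""
    page = max(1, page)
    total = len(turns)
    offset = (page - 1) * page_size   # how many newest items to skip
    end = max(0, total - offset)      # exclusive end index of the window
    start = max(0, end - page_size)
    return {
        "messages": turns[start:end],
        "total": total,
        "page": page,
        "page_size": page_size,
        "has_more": offset + page_size < total,
    }


turns = list(range(5))  # stand-in for five display turns, oldest first
first = paginate_newest_first(turns, page=1, page_size=2)
last = paginate_newest_first(turns, page=3, page_size=2)
```

Both approaches return the same items; the slice form simply avoids building two reversed copies of the full list.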
def get_stats(self) -> Dict[str, Any]:
"""Return basic stats keyed by channel_type, for monitoring."""
with self._lock:
conn = self._connect()
try:
total_sessions = conn.execute(
"SELECT COUNT(*) FROM sessions"
).fetchone()[0]
total_messages = conn.execute(
"SELECT COUNT(*) FROM messages"
).fetchone()[0]
by_channel = conn.execute(
"""
SELECT channel_type, COUNT(*) as cnt
FROM sessions
GROUP BY channel_type
ORDER BY cnt DESC
"""
).fetchall()
return {
"total_sessions": total_sessions,
"total_messages": total_messages,
"by_channel": {row[0] or "unknown": row[1] for row in by_channel},
}
finally:
conn.close()
# ------------------------------------------------------------------
# Internal helpers
# ------------------------------------------------------------------
def _init_db(self) -> None:
self._db_path.parent.mkdir(parents=True, exist_ok=True)
conn = self._connect()
try:
conn.executescript(_DDL)
conn.commit()
self._migrate(conn)
finally:
conn.close()
def _migrate(self, conn: sqlite3.Connection) -> None:
"""Apply incremental schema migrations on existing databases."""
cols = {
row[1]
for row in conn.execute("PRAGMA table_info(sessions)").fetchall()
}
if "channel_type" not in cols:
try:
conn.execute(_MIGRATION_ADD_CHANNEL_TYPE)
conn.commit()
logger.info("[ConversationStore] Migrated: added channel_type column")
except Exception as e:
logger.warning(f"[ConversationStore] Migration failed: {e}")
def _connect(self) -> sqlite3.Connection:
conn = sqlite3.connect(str(self._db_path), timeout=10)
conn.execute("PRAGMA journal_mode=WAL")
conn.execute("PRAGMA synchronous=NORMAL")
return conn
# ---------------------------------------------------------------------------
# Singleton
# ---------------------------------------------------------------------------
_store_instance: Optional[ConversationStore] = None
_store_lock = threading.Lock()
def get_conversation_store() -> ConversationStore:
"""
Return the process-wide ConversationStore singleton.
Reuses the long-term memory database so the project stays with a single
SQLite file: ~/cow/memory/long-term/index.db
The conversation tables (sessions / messages) are separate from the
memory tables (memory_chunks / file_metadata) — no conflicts.
"""
global _store_instance
if _store_instance is not None:
return _store_instance
with _store_lock:
if _store_instance is not None:
return _store_instance
try:
from agent.memory.config import get_default_memory_config
db_path = get_default_memory_config().get_db_path()
except Exception:
from common.utils import expand_path
db_path = Path(expand_path("~/cow")) / "memory" / "long-term" / "index.db"
_store_instance = ConversationStore(db_path)
logger.debug(f"[ConversationStore] Using shared DB at: {db_path}")
return _store_instance

View File

@@ -509,7 +509,7 @@ class MemoryStorage:
"""Destructor to ensure connection is closed"""
try:
self.close()
except:
except Exception:
pass # Ignore errors during cleanup
# Helper methods

View File

@@ -501,7 +501,7 @@ class AgentStreamExecutor:
# Prepare messages
messages = self._prepare_messages()
logger.debug(f"Sending {len(messages)} messages to LLM")
logger.info(f"Sending {len(messages)} messages to LLM")
# Prepare tool definitions (OpenAI/Claude format)
tools_schema = None
@@ -574,7 +574,7 @@ class AgentStreamExecutor:
raise Exception(f"{error_msg} (Status: {status_code}, Code: {error_code}, Type: {error_type})")
# Parse chunk
if isinstance(chunk, dict) and "choices" in chunk:
if isinstance(chunk, dict) and chunk.get("choices"):
choice = chunk["choices"][0]
delta = choice.get("delta", {})
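
The switch from `"choices" in chunk` to `chunk.get("choices")` matters when a stream chunk carries an empty `choices` list. A small sketch of both checks; the function names are hypothetical.

```python
from typing import Any, Optional


def first_choice_unguarded(chunk: Any) -> Optional[dict]:
    """Old check: key presence only, so an empty list raises IndexError."""
    if isinstance(chunk, dict) and "choices" in chunk:
        return chunk["choices"][0]
    return None


def first_choice(chunk: Any) -> Optional[dict]:
    """New check: truthiness, so a missing key, None, or [] is skipped."""
    if isinstance(chunk, dict) and chunk.get("choices"):
        return chunk["choices"][0]
    return None
```

`first_choice({"choices": []})` returns `None`, while the unguarded version raises `IndexError` on the same input.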

View File

@@ -94,7 +94,7 @@ class Ls(BaseTool):
results.append(entry + '/')
else:
results.append(entry)
except:
except Exception:
# Skip entries we can't stat
continue

View File

@@ -451,8 +451,7 @@ def attach_scheduler_to_tool(tool, context: Context = None):
if context:
tool.current_context = context
# Also set channel_type from config
channel_type = conf().get("channel_type", "unknown")
channel_type = context.get("channel_type") or conf().get("channel_type", "unknown")
if not tool.config:
tool.config = {}
tool.config["channel_type"] = channel_type

View File

@@ -147,7 +147,7 @@ class SchedulerService:
return False
return now >= next_run
except:
except Exception:
return False
def _calculate_next_run(self, task: dict, from_time: datetime) -> Optional[datetime]:
@@ -195,7 +195,7 @@ class SchedulerService:
# Only return if in the future
if run_at > from_time:
return run_at
except:
except Exception:
pass
return None

View File

@@ -424,7 +424,7 @@ class SchedulerTool(BaseTool):
try:
dt = datetime.fromisoformat(run_at)
return f"一次性 ({dt.strftime('%Y-%m-%d %H:%M')})"
except:
except Exception:
return "一次性"
return "未知"
@@ -438,6 +438,6 @@ class SchedulerTool(BaseTool):
return msg.other_user_nickname or "群聊"
else:
return msg.from_user_nickname or "用户"
except:
except Exception:
pass
return "未知"

View File

@@ -72,7 +72,7 @@ class TaskStore:
with open(self.store_path, 'r') as src:
with open(backup_path, 'w') as dst:
dst.write(src.read())
except:
except Exception:
pass
# Save tasks

159
app.py
View File

@@ -13,7 +13,6 @@ from plugins import *
import threading
# Global channel manager for restart support
_channel_mgr = None
@@ -21,92 +20,130 @@ def get_channel_manager():
return _channel_mgr
def _parse_channel_type(raw) -> list:
"""
Parse channel_type config value into a list of channel names.
Supports:
- single string: "feishu"
- comma-separated string: "feishu, dingtalk"
- list: ["feishu", "dingtalk"]
"""
if isinstance(raw, list):
return [ch.strip() for ch in raw if ch.strip()]
if isinstance(raw, str):
return [ch.strip() for ch in raw.split(",") if ch.strip()]
return []
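
The three input shapes listed in the docstring above can be exercised directly. The function is restated here under the name `parse_channel_type` (without the leading underscore, an illustrative choice) with its indentation restored.

```python
def parse_channel_type(raw) -> list:
    """Normalise a channel_type config value into a list of channel names."""
    if isinstance(raw, list):
        return [ch.strip() for ch in raw if ch.strip()]
    if isinstance(raw, str):
        return [ch.strip() for ch in raw.split(",") if ch.strip()]
    return []  # None or any other type: no channels configured


single = parse_channel_type("feishu")
comma_separated = parse_channel_type("feishu, dingtalk")
from_list = parse_channel_type(["feishu", " dingtalk ", ""])
```

Empty segments are dropped, so a trailing comma in the configuration does not produce an empty channel name. Note that the list branch assumes every element is a string.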
class ChannelManager:
"""
Manage the lifecycle of a channel, supporting restart from sub-threads.
The channel.startup() runs in a daemon thread so that the main thread
remains available and a new channel can be started at any time.
Manage the lifecycle of multiple channels running concurrently.
Each channel.startup() runs in its own daemon thread.
The web channel is started as default console unless explicitly disabled.
"""
def __init__(self):
self._channel = None
self._channel_thread = None
self._channels = {} # channel_name -> channel instance
self._threads = {} # channel_name -> thread
self._primary_channel = None
self._lock = threading.Lock()
@property
def channel(self):
return self._channel
"""Return the primary (first non-web) channel for backward compatibility."""
return self._primary_channel
def start(self, channel_name: str, first_start: bool = False):
def get_channel(self, channel_name: str):
return self._channels.get(channel_name)
def start(self, channel_names: list, first_start: bool = False):
"""
Create and start a channel in a sub-thread.
Create and start one or more channels in sub-threads.
If first_start is True, plugins and linkai client will also be initialized.
"""
with self._lock:
channel = channel_factory.create_channel(channel_name)
self._channel = channel
channels = []
for name in channel_names:
ch = channel_factory.create_channel(name)
self._channels[name] = ch
channels.append((name, ch))
if self._primary_channel is None and name != "web":
self._primary_channel = ch
if self._primary_channel is None and channels:
self._primary_channel = channels[0][1]
if first_start:
if channel_name in ["wx", "wxy", "terminal", "wechatmp", "web",
"wechatmp_service", "wechatcom_app", "wework",
const.FEISHU, const.DINGTALK]:
PluginManager().load_plugins()
PluginManager().load_plugins()
if conf().get("use_linkai"):
try:
from common import cloud_client
threading.Thread(target=cloud_client.start, args=(channel, self), daemon=True).start()
except Exception as e:
threading.Thread(
target=cloud_client.start,
args=(self._primary_channel, self),
daemon=True,
).start()
except Exception:
pass
# Run channel.startup() in a daemon thread so we can restart later
self._channel_thread = threading.Thread(
target=self._run_channel, args=(channel,), daemon=True
)
self._channel_thread.start()
logger.debug(f"[ChannelManager] Channel '{channel_name}' started in sub-thread")
# Start web console first so its logs print cleanly,
# then start remaining channels after a brief pause.
web_entry = None
other_entries = []
for entry in channels:
if entry[0] == "web":
web_entry = entry
else:
other_entries.append(entry)
def _run_channel(self, channel):
ordered = ([web_entry] if web_entry else []) + other_entries
for i, (name, ch) in enumerate(ordered):
if i > 0 and name != "web":
time.sleep(0.1)
t = threading.Thread(target=self._run_channel, args=(name, ch), daemon=True)
self._threads[name] = t
t.start()
logger.debug(f"[ChannelManager] Channel '{name}' started in sub-thread")
def _run_channel(self, name: str, channel):
try:
channel.startup()
except Exception as e:
logger.error(f"[ChannelManager] Channel startup error: {e}")
logger.error(f"[ChannelManager] Channel '{name}' startup error: {e}")
logger.exception(e)
def stop(self):
def stop(self, channel_name: str = None):
"""
Stop the current channel. Since most channel startup() methods block
on an HTTP server or stream client, we stop by terminating the thread.
Stop channel(s). If channel_name is given, stop only that channel;
otherwise stop all channels.
"""
with self._lock:
if self._channel is None:
return
channel_type = getattr(self._channel, 'channel_type', 'unknown')
logger.info(f"[ChannelManager] Stopping channel '{channel_type}'...")
# Try graceful stop if channel implements it
try:
if hasattr(self._channel, 'stop'):
self._channel.stop()
except Exception as e:
logger.warning(f"[ChannelManager] Error during channel stop: {e}")
self._channel = None
self._channel_thread = None
names = [channel_name] if channel_name else list(self._channels.keys())
for name in names:
ch = self._channels.pop(name, None)
self._threads.pop(name, None)
if ch is None:
continue
logger.info(f"[ChannelManager] Stopping channel '{name}'...")
try:
if hasattr(ch, 'stop'):
ch.stop()
except Exception as e:
logger.warning(f"[ChannelManager] Error during channel '{name}' stop: {e}")
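# Inside the loop: reset the primary reference when the stopped instance was primary.
# (After pop(), self._channels.get(name) is always None, so compare against ch.)
if ch is self._primary_channel:
self._primary_channel = None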
def restart(self, new_channel_name: str):
"""
Restart the channel with a new channel type.
Restart a single channel with a new channel type.
Can be called from any thread (e.g. linkai config callback).
"""
logger.info(f"[ChannelManager] Restarting channel to '{new_channel_name}'...")
self.stop()
# Clear singleton cache so a fresh channel instance is created
self.stop(new_channel_name)
_clear_singleton_cache(new_channel_name)
time.sleep(1) # Brief pause to allow resources to release
self.start(new_channel_name, first_start=False)
time.sleep(1)
self.start([new_channel_name], first_start=False)
logger.info(f"[ChannelManager] Channel restarted to '{new_channel_name}' successfully")
@@ -130,14 +167,11 @@ def _clear_singleton_cache(channel_name: str):
module_path = cls_map.get(channel_name)
if not module_path:
return
# The singleton decorator stores instances in a closure dict keyed by class.
# We need to find the actual class and clear it from the closure.
try:
parts = module_path.rsplit(".", 1)
module_name, class_name = parts[0], parts[1]
import importlib
module = importlib.import_module(module_name)
# The module-level name is the wrapper function from @singleton
wrapper = getattr(module, class_name, None)
if wrapper and hasattr(wrapper, '__closure__') and wrapper.__closure__:
for cell in wrapper.__closure__:
@@ -176,17 +210,28 @@ def run():
# kill signal
sigterm_handler_wrap(signal.SIGTERM)
# create channel
channel_name = conf().get("channel_type", "wx")
# Parse channel_type into a list
raw_channel = conf().get("channel_type", "web")
if "--cmd" in sys.argv:
channel_name = "terminal"
channel_names = ["terminal"]
else:
channel_names = _parse_channel_type(raw_channel)
if not channel_names:
channel_names = ["web"]
if channel_name == "wxy":
if "wxy" in channel_names:
os.environ["WECHATY_LOG"] = "warn"
# Auto-start web console unless explicitly disabled
web_console_enabled = conf().get("web_console", True)
if web_console_enabled and "web" not in channel_names:
channel_names.append("web")
logger.info(f"[App] Starting channels: {channel_names}")
_channel_mgr = ChannelManager()
_channel_mgr.start(channel_name, first_start=True)
_channel_mgr.start(channel_names, first_start=True)
while True:
time.sleep(1)
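
The channel selection in `run()` and the web-first start order in `ChannelManager.start()` can be summarised as two pure functions. This is a sketch under assumptions: `resolve_channel_names` and `startup_order` are hypothetical names, and the `cmd_mode` and `web_console` parameters stand in for the `--cmd` flag and the `web_console` config key.

```python
from typing import List


def resolve_channel_names(parsed: List[str], cmd_mode: bool = False,
                          web_console: bool = True) -> List[str]:
    """Mirror run(): terminal override, web fallback, auto-added web console."""
    names = ["terminal"] if cmd_mode else list(parsed)
    if not names:
        names = ["web"]  # nothing configured: fall back to the web channel
    if web_console and "web" not in names:
        names.append("web")
    return names


def startup_order(names: List[str]) -> List[str]:
    """Web console first, remaining channels in their configured order."""
    web = [n for n in names if n == "web"]
    others = [n for n in names if n != "web"]
    return web + others


names = resolve_channel_names(["feishu", "dingtalk"])
ordered = startup_order(names)
```

As written in the diff, the web console is appended even in `--cmd` mode unless `web_console` is set to false.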

View File

@@ -135,7 +135,7 @@ class AgentLLMModel(LLMModel):
# Use tool-enabled streaming call if available
# Extract system prompt if present
system_prompt = getattr(request, 'system', None)
# Build kwargs for call_with_tools
kwargs = {
'messages': request.messages,
@@ -143,15 +143,20 @@ class AgentLLMModel(LLMModel):
'stream': True,
'model': self.model # Pass model parameter
}
# Only pass max_tokens if explicitly set, let the bot use its default
if request.max_tokens is not None:
kwargs['max_tokens'] = request.max_tokens
# Add system prompt if present
if system_prompt:
kwargs['system'] = system_prompt
# Pass channel_type for linkai tracking
channel_type = getattr(self, 'channel_type', None)
if channel_type:
kwargs['channel_type'] = channel_type
stream = self.bot.call_with_tools(**kwargs)
# Convert stream format to our expected format
@@ -325,6 +330,14 @@ class AgentBridge:
logger.warning(f"[AgentBridge] Failed to attach context to scheduler: {e}")
break
# Pass channel_type to model so linkai requests carry it
if context and hasattr(agent, 'model'):
agent.model.channel_type = context.get("channel_type", "")
# Record message count before execution so we can diff new messages
with agent.messages_lock:
pre_run_len = len(agent.messages)
try:
# Use agent's run_stream method with event handler
response = agent.run_stream(
@@ -336,9 +349,16 @@ class AgentBridge:
# Restore original tools
if context and context.get("is_scheduled_task"):
agent.tools = original_tools
# Log execution summary
event_handler.log_summary()
# Persist new messages generated during this run
if session_id:
channel_type = (context.get("channel_type") or "") if context else ""
with agent.messages_lock:
new_messages = agent.messages[pre_run_len:]
self._persist_messages(session_id, list(new_messages), channel_type)
# Check if there are files to send (from read tool)
if hasattr(agent, 'stream_executor') and hasattr(agent.stream_executor, 'files_to_send'):
@@ -475,6 +495,32 @@ class AgentBridge:
except Exception as e:
logger.warning(f"[AgentBridge] Failed to migrate API keys: {e}")
def _persist_messages(
self, session_id: str, new_messages: list, channel_type: str = ""
) -> None:
"""
Persist new messages to the conversation store after each agent run.
Failures are logged but never propagate — they must not interrupt replies.
"""
if not new_messages:
return
try:
from config import conf
if not conf().get("conversation_persistence", True):
return
except Exception:
pass
try:
from agent.memory import get_conversation_store
get_conversation_store().append_messages(
session_id, new_messages, channel_type=channel_type
)
except Exception as e:
logger.warning(
f"[AgentBridge] Failed to persist messages for session={session_id}: {e}"
)
def clear_session(self, session_id: str):
"""
Clear a specific session's agent and conversation history
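The persistence change in this file follows a "record the length, run, then slice" pattern: only messages appended during the current run are written, and a storage failure never interrupts the reply. The sketch below illustrates that pattern in isolation. `InMemoryStore` and `run_and_persist` are hypothetical names used only for illustration; the real store is obtained through `get_conversation_store()` and its internals are not shown in this diff.

```python
import threading


class InMemoryStore:
    """Hypothetical stand-in for the conversation store (illustration only)."""

    def __init__(self, fail=False):
        self.rows = []
        self.fail = fail

    def append_messages(self, session_id, messages, channel_type=""):
        if self.fail:
            raise RuntimeError("store unavailable")
        self.rows.extend((session_id, channel_type, m) for m in messages)


def run_and_persist(messages, lock, run, store, session_id, channel_type=""):
    """Run one agent turn, then persist only the messages that turn appended."""
    with lock:
        pre_run_len = len(messages)  # snapshot the length before the run
    run(messages)  # the run appends user/assistant/tool messages
    with lock:
        new_messages = list(messages[pre_run_len:])  # copy just the new tail
    if not new_messages:
        return 0
    try:
        store.append_messages(session_id, new_messages, channel_type=channel_type)
    except Exception as exc:
        # Persistence is best-effort: report the failure and carry on.
        print(f"persist failed for session={session_id}: {exc}")
        return 0
    return len(new_messages)
```

Copying the slice while holding the lock means the store receives a stable list even if another thread appends to `messages` afterwards.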

View File

@@ -118,8 +118,47 @@ class AgentInitializer:
# Attach memory manager
if memory_manager:
agent.memory_manager = memory_manager
# Restore persisted conversation history for this session
if session_id:
self._restore_conversation_history(agent, session_id)
return agent
def _restore_conversation_history(self, agent, session_id: str) -> None:
"""
Load persisted conversation messages from SQLite and inject them
into the agent's in-memory message list.
Only runs when conversation persistence is enabled (default: True).
Respects agent_max_context_turns to limit how many turns are loaded.
"""
from config import conf
if not conf().get("conversation_persistence", True):
return
try:
from agent.memory import get_conversation_store
store = get_conversation_store()
# On restore, load at most min(6, max(1, max_turns // 3)) turns so that
# a long-running session does not immediately fill the context window
# after a restart. The full max_turns budget is reserved for the
# live conversation that follows.
max_turns = conf().get("agent_max_context_turns", 30)
restore_turns = min(6, max(1, max_turns // 3))
saved = store.load_messages(session_id, max_turns=restore_turns)
if saved:
with agent.messages_lock:
agent.messages = saved
logger.debug(
f"[AgentInitializer] Restored {len(saved)} messages "
f"({restore_turns} turns cap) for session={session_id}"
)
except Exception as e:
logger.warning(
f"[AgentInitializer] Failed to restore conversation history for "
f"session={session_id}: {e}"
)
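To make the restore cap concrete, the expression used in `_restore_conversation_history` can be evaluated for a few values of `agent_max_context_turns`. This is a minimal sketch; `restore_turn_cap` is a hypothetical helper name that does not exist in the project.

```python
def restore_turn_cap(max_turns: int) -> int:
    """Same expression as in the diff: min(6, max(1, max_turns // 3))."""
    return min(6, max(1, max_turns // 3))


# Print the cap for a handful of configured values.
for configured in (1, 3, 9, 18, 30, 100):
    print(configured, "->", restore_turn_cap(configured))
```

With the default of 30 the cap is 6 turns, and very small settings still restore at least one turn.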
def _load_env_file(self):
"""Load environment variables from .env file"""
@@ -283,7 +322,14 @@ class AgentInitializer:
tool.scheduler_service = scheduler_service
if not tool.config:
tool.config = {}
tool.config["channel_type"] = conf().get("channel_type", "unknown")
raw_ct = conf().get("channel_type", "unknown")
if isinstance(raw_ct, list):
ct = raw_ct[0] if raw_ct else "unknown"
elif isinstance(raw_ct, str) and "," in raw_ct:
ct = raw_ct.split(",")[0].strip()
else:
ct = raw_ct
tool.config["channel_type"] = ct
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to inject scheduler dependencies: {e}")
@@ -330,7 +376,7 @@ class AgentInitializer:
return {
"model": conf().get("model", "unknown"),
"workspace": workspace_root,
"channel": conf().get("channel_type", "unknown"),
"channel": ", ".join(conf().get("channel_type")) if isinstance(conf().get("channel_type"), list) else conf().get("channel_type", "unknown"),
"_get_current_time": get_current_time # Dynamic time function
}
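Both hunks above deal with `channel_type` arriving either as a list or as a comma-separated string. One way to keep the two call sites consistent would be a single normalizing helper, sketched below. `normalize_channel_types` is a hypothetical function, not something the project defines, and unlike the diff it maps an empty value to the default instead of passing the empty string through.

```python
def normalize_channel_types(raw, default="unknown"):
    """Return channel_type as a non-empty list of channel names (hypothetical helper)."""
    if isinstance(raw, list):
        names = [str(item).strip() for item in raw]
    elif isinstance(raw, str):
        names = [part.strip() for part in raw.split(",")]
    else:
        names = []
    names = [name for name in names if name]  # drop empty entries
    return names or [default]


# The scheduler tool wants one channel; the runtime info wants a display string.
primary = normalize_channel_types("feishu, dingtalk")[0]
display = ", ".join(normalize_channel_types(["feishu", "dingtalk"]))
print(primary, "|", display)
```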

View File

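@@ -24,11 +24,16 @@ handler_pool = ThreadPoolExecutor(max_workers=8) # thread pool for handling messages
class ChatChannel(Channel):
name = None # logged-in user name
user_id = None # logged-in user id
futures = {} # future objects submitted to the thread pool per session_id; on session reset, pending futures are cancelled while running ones are not
sessions = {} # concurrency control: only one context per session_id is processed at a time
lock = threading.Lock() # guards access to sessions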
def __init__(self):
# Instance-level attributes so each channel subclass has its own
# independent session queue and lock. Previously these were class-level,
# which caused contexts from one channel (e.g. Feishu) to be consumed
# by another channel's consume() thread (e.g. Web), leading to errors
# like "No request_id found in context".
self.futures = {}
self.sessions = {}
self.lock = threading.Lock()
_thread = threading.Thread(target=self.consume)
_thread.setDaemon(True)
_thread.start()
@@ -37,9 +42,8 @@ class ChatChannel(Channel):
def _compose_context(self, ctype: ContextType, content, **kwargs):
context = Context(ctype, content)
context.kwargs = kwargs
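# origin_ctype is None the first time a context is passed in.
# It was introduced because voice input produces two nested contexts: first speech-to-text, then a text reply generated from that text.
# origin_ctype lets the second step decide whether prefix matching is needed; voice messages in private chats skip it.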
if "channel_type" not in context:
context["channel_type"] = self.channel_type
if "origin_ctype" not in context:
context["origin_ctype"] = ctype
# receiver is None the first time a context is passed in; set it according to the type
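The move from class-level to instance-level `futures`, `sessions`, and `lock` matters because a mutable class attribute is a single object shared by every subclass and instance. The sketch below reproduces the effect with toy classes. All class names here are made up for illustration and are not the project's channel classes.

```python
import threading


class SharedStateChannel:
    sessions = {}  # class attribute: one dict shared by every subclass and instance


class FeishuShared(SharedStateChannel):
    pass


class WebShared(SharedStateChannel):
    pass


class IsolatedChannel:
    def __init__(self):
        self.sessions = {}  # instance attribute: one dict per channel object
        self.lock = threading.Lock()


class FeishuIsolated(IsolatedChannel):
    pass


class WebIsolated(IsolatedChannel):
    pass


feishu_shared, web_shared = FeishuShared(), WebShared()
feishu_shared.sessions["s1"] = "feishu context"
print("s1" in web_shared.sessions)  # True: the Web channel sees Feishu's session

feishu_iso, web_iso = FeishuIsolated(), WebIsolated()
with feishu_iso.lock:
    feishu_iso.sessions["s1"] = "feishu context"
print("s1" in web_iso.sessions)  # False: each channel has its own dict
```

In multi-channel mode each channel runs its own `consume()` thread, so a shared dict lets one thread pick up contexts that were queued by another channel.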

View File

@@ -698,6 +698,8 @@ class FeiShuChanel(ChatChannel):
def _compose_context(self, ctype: ContextType, content, **kwargs):
context = Context(ctype, content)
context.kwargs = kwargs
if "channel_type" not in context:
context["channel_type"] = self.channel_type
if "origin_ctype" not in context:
context["origin_ctype"] = ctype

View File

@@ -43,8 +43,17 @@
}
</script>
<link rel="stylesheet" href="assets/css/console.css">
<!-- Apply theme/lang before first paint to avoid flash of unstyled content.
This runs synchronously in <head> so the correct class is on <html>
before any CSS or body rendering occurs. -->
<script>
(function() {
var theme = localStorage.getItem('cow_theme') || 'dark';
if (theme === 'dark') document.documentElement.classList.add('dark');
})();
</script>
</head>
<body class="h-screen overflow-hidden bg-gray-50 dark:bg-[#111111] text-slate-800 dark:text-slate-200 font-sans transition-colors duration-200">
<body class="h-screen overflow-hidden bg-gray-50 dark:bg-[#111111] text-slate-800 dark:text-slate-200 font-sans">
<div id="app" class="flex h-screen">
<!-- ================================================================ -->

View File

@@ -232,19 +232,37 @@ function renderMarkdown(text) {
// =====================================================================
// Chat Module
// =====================================================================
let sessionId = generateSessionId();
let isPolling = false;
let loadingContainers = {};
let activeStreams = {}; // request_id -> EventSource
let isComposing = false;
let appConfig = { use_agent: false, title: 'CowAgent', subtitle: '' };
const SESSION_ID_KEY = 'cow_session_id';
function generateSessionId() {
return 'session_' + ([1e7]+-1e3+-4e3+-8e3+-1e11).replace(/[018]/g, c =>
(c ^ crypto.getRandomValues(new Uint8Array(1))[0] & 15 >> c / 4).toString(16)
);
}
// Restore session_id from localStorage so conversation history survives page refresh.
// A new id is only generated when the user explicitly starts a new chat.
function loadOrCreateSessionId() {
const stored = localStorage.getItem(SESSION_ID_KEY);
if (stored) return stored;
const fresh = generateSessionId();
localStorage.setItem(SESSION_ID_KEY, fresh);
return fresh;
}
let sessionId = loadOrCreateSessionId();
// ---- Conversation history state ----
let historyPage = 0; // last page fetched (0 = nothing fetched yet)
let historyHasMore = false;
let historyLoading = false;
fetch('/config').then(r => r.json()).then(data => {
if (data.status === 'success') {
appConfig = data;
@@ -257,14 +275,20 @@ fetch('/config').then(r => r.json()).then(data => {
document.getElementById('cfg-max-steps').textContent = data.agent_max_steps || '--';
document.getElementById('cfg-channel').textContent = data.channel_type || '--';
}
}).catch(() => {});
// Load conversation history after config is ready
loadHistory(1);
}).catch(() => { loadHistory(1); });
const chatInput = document.getElementById('chat-input');
const sendBtn = document.getElementById('send-btn');
const messagesDiv = document.getElementById('chat-messages');
chatInput.addEventListener('compositionstart', () => { isComposing = true; });
chatInput.addEventListener('compositionend', () => { isComposing = false; });
// Safari fires compositionend *before* the confirming keydown event, so if we
// reset isComposing synchronously the keydown handler sees !isComposing and
// sends the message prematurely. A setTimeout(0) defers the reset until after
// keydown has been processed, fixing the Safari IME Enter-to-confirm bug.
chatInput.addEventListener('compositionend', () => { setTimeout(() => { isComposing = false; }, 0); });
chatInput.addEventListener('input', function() {
this.style.height = '42px';
@@ -530,7 +554,7 @@ function startPolling() {
poll();
}
function addUserMessage(content, timestamp) {
function createUserMessageEl(content, timestamp) {
const el = document.createElement('div');
el.className = 'flex justify-end px-4 sm:px-6 py-3';
el.innerHTML = `
@@ -541,28 +565,141 @@ function addUserMessage(content, timestamp) {
<div class="text-xs text-slate-400 dark:text-slate-500 mt-1.5 text-right">${formatTime(timestamp)}</div>
</div>
`;
return el;
}
function renderToolCallsHtml(toolCalls) {
if (!toolCalls || toolCalls.length === 0) return '';
return toolCalls.map(tc => {
const argsStr = formatToolArgs(tc.arguments || {});
const resultStr = tc.result ? escapeHtml(String(tc.result)) : '';
const hasResult = !!resultStr;
return `
<div class="agent-step agent-tool-step">
<div class="tool-header" onclick="this.parentElement.classList.toggle('expanded')">
<i class="fas fa-check text-primary-400 flex-shrink-0 tool-icon"></i>
<span class="tool-name">${escapeHtml(tc.name || '')}</span>
<i class="fas fa-chevron-right tool-chevron"></i>
</div>
<div class="tool-detail">
<div class="tool-detail-section">
<div class="tool-detail-label">Input</div>
<pre class="tool-detail-content">${argsStr}</pre>
</div>
${hasResult ? `
<div class="tool-detail-section tool-output-section">
<div class="tool-detail-label">Output</div>
<pre class="tool-detail-content">${resultStr}</pre>
</div>` : ''}
</div>
</div>`;
}).join('');
}
function createBotMessageEl(content, timestamp, requestId, toolCalls) {
const el = document.createElement('div');
el.className = 'flex gap-3 px-4 sm:px-6 py-3';
if (requestId) el.dataset.requestId = requestId;
const toolsHtml = renderToolCallsHtml(toolCalls);
el.innerHTML = `
<img src="assets/logo.jpg" alt="CowAgent" class="w-8 h-8 rounded-lg flex-shrink-0">
<div class="min-w-0 flex-1 max-w-[85%]">
<div class="bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-2xl px-4 py-3 text-sm leading-relaxed msg-content text-slate-700 dark:text-slate-200">
${toolsHtml ? `<div class="agent-steps">${toolsHtml}</div>` : ''}
<div class="answer-content">${renderMarkdown(content)}</div>
</div>
<div class="text-xs text-slate-400 dark:text-slate-500 mt-1.5">${formatTime(timestamp)}</div>
</div>
`;
applyHighlighting(el);
return el;
}
function addUserMessage(content, timestamp) {
const el = createUserMessageEl(content, timestamp);
messagesDiv.appendChild(el);
scrollChatToBottom();
}
function addBotMessage(content, timestamp, requestId) {
const el = document.createElement('div');
el.className = 'flex gap-3 px-4 sm:px-6 py-3';
if (requestId) el.dataset.requestId = requestId;
el.innerHTML = `
<img src="assets/logo.jpg" alt="CowAgent" class="w-8 h-8 rounded-lg flex-shrink-0">
<div class="min-w-0 flex-1 max-w-[85%]">
<div class="bg-white dark:bg-[#1A1A1A] border border-slate-200 dark:border-white/10 rounded-2xl px-4 py-3 text-sm leading-relaxed msg-content text-slate-700 dark:text-slate-200">
${renderMarkdown(content)}
</div>
<div class="text-xs text-slate-400 dark:text-slate-500 mt-1.5">${formatTime(timestamp)}</div>
</div>
`;
const el = createBotMessageEl(content, timestamp, requestId);
messagesDiv.appendChild(el);
applyHighlighting(el);
scrollChatToBottom();
}
// Load conversation history from the server (page 1 = most recent messages).
// Subsequent pages prepend older messages when the user scrolls to the top.
function loadHistory(page) {
if (historyLoading) return;
historyLoading = true;
fetch(`/api/history?session_id=${encodeURIComponent(sessionId)}&page=${page}&page_size=20`)
.then(r => r.json())
.then(data => {
if (data.status !== 'success' || data.messages.length === 0) return;
const prevScrollHeight = messagesDiv.scrollHeight;
const isFirstLoad = page === 1;
// On first load, remove the welcome screen if history exists
if (isFirstLoad) {
const ws = document.getElementById('welcome-screen');
if (ws) ws.remove();
}
// Build a fragment of history message elements in chronological order
const fragment = document.createDocumentFragment();
if (data.has_more && page > 1) {
// Keep the "load more" sentinel in place (inserted below)
}
data.messages.forEach(msg => {
const hasContent = msg.content && msg.content.trim();
const hasToolCalls = msg.role === 'assistant' && msg.tool_calls && msg.tool_calls.length > 0;
if (!hasContent && !hasToolCalls) return;
const ts = new Date(msg.created_at * 1000);
const el = msg.role === 'user'
? createUserMessageEl(msg.content, ts)
: createBotMessageEl(msg.content || '', ts, null, msg.tool_calls);
fragment.appendChild(el);
});
// Prepend history above any existing messages
const sentinel = document.getElementById('history-load-more');
const insertBefore = sentinel ? sentinel.nextSibling : messagesDiv.firstChild;
messagesDiv.insertBefore(fragment, insertBefore);
// Manage the "load more" sentinel at the very top
if (data.has_more) {
if (!document.getElementById('history-load-more')) {
const btn = document.createElement('div');
btn.id = 'history-load-more';
btn.className = 'flex justify-center py-3';
btn.innerHTML = `<button class="text-xs text-slate-400 dark:text-slate-500 hover:text-primary-400 transition-colors" onclick="loadHistory(historyPage + 1)">Load earlier messages</button>`;
messagesDiv.insertBefore(btn, messagesDiv.firstChild);
}
} else {
const sentinel = document.getElementById('history-load-more');
if (sentinel) sentinel.remove();
}
historyHasMore = data.has_more;
historyPage = page;
if (isFirstLoad) {
// Use requestAnimationFrame to ensure the DOM has fully rendered
// before scrolling, otherwise scrollHeight may not reflect new content.
requestAnimationFrame(() => scrollChatToBottom());
} else {
// Restore scroll position so loading older messages doesn't jump the view
messagesDiv.scrollTop = messagesDiv.scrollHeight - prevScrollHeight;
}
})
.catch(() => {})
.finally(() => { historyLoading = false; });
}
function addLoadingIndicator() {
const el = document.createElement('div');
el.className = 'flex gap-3 px-4 sm:px-6 py-3';
@@ -586,7 +723,9 @@ function newChat() {
Object.values(activeStreams).forEach(es => { try { es.close(); } catch (_) {} });
activeStreams = {};
// Generate a fresh session and persist it so the next page load also starts clean
sessionId = generateSessionId();
localStorage.setItem(SESSION_ID_KEY, sessionId);
isPolling = false;
loadingContainers = {};
messagesDiv.innerHTML = '';
@@ -969,3 +1108,11 @@ applyTheme();
applyI18n();
document.getElementById('sidebar-version').textContent = `CowAgent ${APP_VERSION}`;
chatInput.focus();
// Re-enable color transition AFTER first paint so the theme applied in <head>
// doesn't produce an animated flash on load. The class is missing from the
// body initially; adding it here means transitions only fire on user-triggered
// theme toggles, not on page load.
requestAnimationFrame(() => {
document.body.classList.add('transition-colors', 'duration-200');
});

View File

@@ -1,9 +1,15 @@
import sys
import time
import web
import json
import logging
import mimetypes
import os
import threading
import time
import uuid
from queue import Queue, Empty
import web
from bridge.context import *
from bridge.reply import Reply, ReplyType
from channel.chat_channel import ChatChannel, check_prefix
@@ -11,20 +17,17 @@ from channel.chat_message import ChatMessage
from common.log import logger
from common.singleton import singleton
from config import conf
import os
import mimetypes
import threading
import logging
class WebMessage(ChatMessage):
def __init__(
self,
msg_id,
content,
ctype=ContextType.TEXT,
from_user_id="User",
to_user_id="Chatgpt",
other_user_id="Chatgpt",
self,
msg_id,
content,
ctype=ContextType.TEXT,
from_user_id="User",
to_user_id="Chatgpt",
other_user_id="Chatgpt",
):
self.msg_id = msg_id
self.ctype = ctype
@@ -38,7 +41,7 @@ class WebMessage(ChatMessage):
class WebChannel(ChatChannel):
NOT_SUPPORT_REPLYTYPE = [ReplyType.VOICE]
_instance = None
# def __new__(cls):
# if cls._instance is None:
# cls._instance = super(WebChannel, cls).__new__(cls)
@@ -47,12 +50,11 @@ class WebChannel(ChatChannel):
def __init__(self):
super().__init__()
self.msg_id_counter = 0
self.session_queues = {} # session_id -> Queue (fallback polling)
self.request_to_session = {} # request_id -> session_id
self.sse_queues = {} # request_id -> Queue (SSE streaming)
self.session_queues = {} # session_id -> Queue (fallback polling)
self.request_to_session = {} # request_id -> session_id
self.sse_queues = {} # request_id -> Queue (SSE streaming)
self._http_server = None
def _generate_msg_id(self):
"""Generate a unique message ID"""
self.msg_id_counter += 1
@@ -111,6 +113,7 @@ class WebChannel(ChatChannel):
def _make_sse_callback(self, request_id: str):
"""Build an on_event callback that pushes agent stream events into the SSE queue."""
def on_event(event: dict):
if request_id not in self.sse_queues:
return
@@ -237,28 +240,28 @@ class WebChannel(ChatChannel):
data = web.data()
json_data = json.loads(data)
session_id = json_data.get('session_id')
if not session_id or session_id not in self.session_queues:
return json.dumps({"status": "error", "message": "Invalid session ID"})
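# Try to fetch a response from the queue without blocking
try:
# The intent was to peek rather than get, so an unprocessed response could be fetched again next time; note that the call below still uses get
response = self.session_queues[session_id].get(block=False)
# Return the response together with its request ID so different requests can be told apart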
return json.dumps({
"status": "success",
"status": "success",
"has_content": True,
"content": response["content"],
"request_id": response["request_id"],
"timestamp": response["timestamp"]
})
except Empty:
# No new response
return json.dumps({"status": "success", "has_content": False})
except Exception as e:
logger.error(f"Error polling response: {e}")
return json.dumps({"status": "error", "message": str(e)})
@@ -271,9 +274,10 @@ class WebChannel(ChatChannel):
def startup(self):
port = conf().get("web_port", 9899)
# Log a hint listing the available channel types
logger.info("[WebChannel] 当前channel为web可修改 config.json 配置文件中的 channel_type 字段进行切换。全部可用类型为:")
logger.info(
"[WebChannel] 全部可用通道如下,可修改 config.json 配置文件中的 channel_type 字段进行切换,多个通道用逗号分隔:")
logger.info("[WebChannel] 1. web - 网页")
logger.info("[WebChannel] 2. terminal - 终端")
logger.info("[WebChannel] 3. feishu - 飞书")
@@ -281,16 +285,16 @@ class WebChannel(ChatChannel):
logger.info("[WebChannel] 5. wechatcom_app - 企微自建应用")
logger.info("[WebChannel] 6. wechatmp - 个人公众号")
logger.info("[WebChannel] 7. wechatmp_service - 企业公众号")
logger.info("[WebChannel] ✅ Web控制台已运行")
logger.info(f"[WebChannel] 🌐 本地访问: http://localhost:{port}")
logger.info(f"[WebChannel] 🌍 服务器访问: http://YOUR_IP:{port} (请将YOUR_IP替换为服务器IP)")
logger.info("[WebChannel] ✅ Web对话网页已运行")
# Make sure the static file directory exists
static_dir = os.path.join(os.path.dirname(__file__), 'static')
if not os.path.exists(static_dir):
os.makedirs(static_dir)
logger.debug(f"[WebChannel] Created static directory: {static_dir}")
urls = (
'/', 'RootHandler',
'/message', 'MessageHandler',
@@ -302,18 +306,19 @@ class WebChannel(ChatChannel):
'/api/memory', 'MemoryHandler',
'/api/memory/content', 'MemoryContentHandler',
'/api/scheduler', 'SchedulerHandler',
'/api/history', 'HistoryHandler',
'/api/logs', 'LogsHandler',
'/assets/(.*)', 'AssetsHandler',
)
app = web.application(urls, globals(), autoreload=False)
# Fully disable web.py's HTTP log output
web.httpserver.LogMiddleware.log = lambda self, status, environ: None
# Set web.py's log level to ERROR
logging.getLogger("web").setLevel(logging.ERROR)
logging.getLogger("web.httpserver").setLevel(logging.ERROR)
# Build WSGI app with middleware (same as runsimple but without print)
func = web.httpserver.StaticMiddleware(app.wsgifunc())
func = web.httpserver.LogMiddleware(func)
@@ -471,6 +476,37 @@ class SchedulerHandler:
return json.dumps({"status": "error", "message": str(e)})
class HistoryHandler:
def GET(self):
"""
Return paginated conversation history for a session.
Query params:
session_id (required)
page int, default 1 (1 = most recent messages)
page_size int, default 20
"""
web.header('Content-Type', 'application/json; charset=utf-8')
web.header('Access-Control-Allow-Origin', '*')
try:
params = web.input(session_id='', page='1', page_size='20')
session_id = params.session_id.strip()
if not session_id:
return json.dumps({"status": "error", "message": "session_id required"})
from agent.memory import get_conversation_store
store = get_conversation_store()
result = store.load_history_page(
session_id=session_id,
page=int(params.page),
page_size=int(params.page_size),
)
return json.dumps({"status": "success", **result}, ensure_ascii=False)
except Exception as e:
logger.error(f"[WebChannel] History API error: {e}")
return json.dumps({"status": "error", "message": str(e)})
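`HistoryHandler` delegates to `store.load_history_page`, whose implementation is not part of this diff. The sketch below shows one plausible reading of the "page 1 = most recent messages" contract over an in-memory list, returning the `messages` and `has_more` fields that the frontend reads. `paginate_recent` is a hypothetical function, and the input validation is an assumption: the handler shown above passes `int(params.page)` through unchecked.

```python
def paginate_recent(messages, page=1, page_size=20):
    """Return one page of `messages` (stored oldest first); page 1 is the newest slice."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    total = len(messages)
    end = max(0, total - (page - 1) * page_size)  # exclusive upper bound of this page
    start = max(0, end - page_size)
    return {
        "messages": messages[start:end],  # still in chronological order
        "has_more": start > 0,            # older messages remain before this page
        "page": page,
    }
```

Keeping each page in chronological order lets the client prepend it above the messages it already shows without re-sorting.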
class LogsHandler:
def GET(self):
"""Stream the last N lines of run.log as SSE, then tail new lines."""

View File

@@ -28,7 +28,7 @@ def check_dulwich():
except ImportError:
try:
install("dulwich")
except:
except Exception:
needwait = True
try:
import dulwich

View File

@@ -160,7 +160,8 @@ available_setting = {
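# custom trigger words for chatgpt commands
"clear_memory_commands": ["#清除记忆"], # session reset commands; must start with #
# channel configuration
"channel_type": "", # channel type; supported: {wx,wxy,terminal,wechatmp,wechatmp_service,wechatcom_app,dingtalk}
"channel_type": "", # channel type; multiple channels can run at the same time. Single: "feishu"; multiple: "feishu, dingtalk" or ["feishu", "dingtalk"]. Options: web,feishu,dingtalk,wechatmp,wechatmp_service,wechatcom_app
"web_console": True, # whether to start the web console automatically (enabled by default); set to False to disable
"subscribe_msg": "", # subscription message; supported by: wechatmp, wechatmp_service, wechatcom_app
"debug": False, # whether to enable debug mode; more logs are printed when it is on
"appdata_dir": "", # data directory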
@@ -186,6 +187,7 @@ available_setting = {
"linkai_api_key": "",
"linkai_app_code": "",
"linkai_api_base": "https://api.link-ai.tech", # LinkAI service URL
"cloud_host": "client.link-ai.tech",
"minimax_api_key": "",
"Minimax_group_id": "",
"Minimax_base_url": "",
@@ -322,7 +324,7 @@ def load_config():
logger.info("[INIT] override config by environ args: {}={}".format(name, value))
try:
config[name] = eval(value)
except:
except Exception:
if value == "false":
config[name] = False
elif value == "true":
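The hunk above parses environment overrides with `eval` and falls back to special-casing the strings `true` and `false`. As an illustration of the same parse-then-fallback idea, the sketch below swaps in `ast.literal_eval`, which accepts only Python literals. This is a substitution made for the example, not what the project does. `parse_env_override` is a hypothetical name, and returning the raw string in the last branch is an assumption because the diff is cut off at that point.

```python
import ast


def parse_env_override(value: str):
    """Parse an environment-variable string into a Python value (illustrative only)."""
    try:
        # Handles numbers, quoted strings, lists, dicts, True/False/None.
        return ast.literal_eval(value)
    except (ValueError, SyntaxError, TypeError):
        if value == "false":
            return False
        if value == "true":
            return True
        return value  # assumption: keep anything else as a plain string
```

With this sketch a value such as `feishu, dingtalk` stays a string while `["feishu", "dingtalk"]` becomes a list, which are the two multi-channel forms mentioned in the `channel_type` comment.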

View File

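@@ -23,7 +23,7 @@ The Cow project has been fully upgraded from a simple chatbot to the super assistant **CowAgent
In subsequent long-running conversations, the Agent intelligently records or retrieves memories when needed, keeps updating its own persona, user preferences, and memory files, and summarizes lessons learned, so that it can think autonomously and keep improving.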
<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png">
<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png" />
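@@ -37,14 +37,14 @@ The Cow project has been fully upgraded from a simple chatbot to the super assistant **CowAgent
Access to the operating system's terminal and files is the most basic and essential tool; many other tools and skills build on these basic tools. Users can interact with the Agent from a phone to operate resources on a personal computer or server: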
<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png">
<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png" />
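#### 1.2 Programming
Building on its programming and system-access capabilities, the Agent can carry out the whole vibe-coding flow: information search, generation of images and other assets, coding, testing, deployment, Nginx configuration changes, and release. A single short command from a phone is enough to produce a quick application demo: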
<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png">
<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png" />
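@@ -53,7 +53,7 @@ The Cow project has been fully upgraded from a simple chatbot to the super assistant **CowAgent
Dynamic scheduled tasks are implemented with the scheduler tool, which supports three forms: **one-off tasks, fixed intervals, and Cron expressions**. When a task fires, it can either **send a fixed message** or run an **Agent dynamic task**, which makes it highly flexible: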
<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png">
<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png" />
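You can also view and manage existing scheduled tasks quickly through natural language.
@@ -62,7 +62,7 @@ The Cow project has been fully upgraded from a simple chatbot to the super assistant **CowAgent
The secrets that skills require are stored in an environment variable file and managed by the `env_config` tool. You can update secrets through conversation; the tool has built-in safeguards and masking to keep them safe: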
<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png">
<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png" />
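### 3. Skills System
@@ -77,7 +77,7 @@ The Cow project has been fully upgraded from a simple chatbot to the super assistant **CowAgent
The `skill-creator` skill lets you create skills quickly through conversation. While working with the Agent you can have it turn a workflow into a skill, or send it any API documentation and examples and let it complete the integration directly: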
<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png">
<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png" />
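#### 3.2 Search and Image Recognition
@@ -85,7 +85,7 @@ The Cow project has been fully upgraded from a simple chatbot to the super assistant **CowAgent
- **Search skill:** the system ships a built-in `bocha-search` (Bocha Search) skill. It depends on the environment variable `BOCHA_SEARCH_API_KEY`, which you can create in the [console](https://open.bochaai.com/) and send to the Agent to complete configuration
- **Image recognition skill:** the `openai-image-vision` plugin is implemented and can use image recognition models such as gpt-4.1-mini and gpt-4.1. It depends on the key `OPENAI_API_KEY`, which can be maintained through config.json or the env_config tool.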
<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png">
<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png" />
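#### 3.3 Third-Party Knowledge Bases and Plugins
@@ -113,7 +113,7 @@ The Cow project has been fully upgraded from a simple chatbot to the super assistant **CowAgent
The Agent makes decisions based on each agent's name and description, and calls the API with the app_code to reach the corresponding app or workflow. This skill gives flexible access to agents, knowledge bases, plugins, and other capabilities on the LinkAI platform. The result looks like this: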
<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png">
<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png" />
Note: configure `LINKAI_API_KEY` through `env_config`, or add `linkai_api_key` to config.json.

View File

@@ -0,0 +1,38 @@
---
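title: DingTalk
description: Integrate CowAgent into a DingTalk app
---
Create an intelligent robot app on the DingTalk Open Platform to connect CowAgent to DingTalk.
## 1. Create the App
1. Go to the [DingTalk Developer Console](https://open-dev.dingtalk.com/fe/app#/corp/app), click **Create App**, and fill in the app information
2. Click Add App Capability, select the **Robot** capability, and add it
3. Configure the robot information, then click **Publish**
## 2. Project Configuration
1. Get the `Client ID` and `Client Secret` from **Credentials & Basic Info**
2. Add them to `config.json`: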
```json
{
"channel_type": "dingtalk",
"dingtalk_client_id": "YOUR_CLIENT_ID",
"dingtalk_client_secret": "YOUR_CLIENT_SECRET"
}
```
3. Install the dependency:
```bash
pip3 install dingtalk_stream
```
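4. After starting the project, go to the DingTalk Developer Console, click **Event Subscription**, then click **Integration complete, verify the connection channel**. The message "Connection established" means configuration is complete
## 3. Usage
Chat with the robot privately, or add it to an enterprise group, to start a conversation.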

67
docs/channels/feishu.mdx Normal file
View File

@@ -0,0 +1,67 @@
---
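title: Feishu
description: Integrate CowAgent into a Feishu app
---
Connect CowAgent to Feishu through a custom app. Two event-receiving modes are supported: WebSocket long connection (recommended) and Webhook.
## 1. Create an Enterprise Custom App
### 1. Create the app
Go to the [Feishu Open Platform](https://open.feishu.cn/app/), click **Create Enterprise Custom App**, fill in the required information, and create it.
### 2. Add the bot capability
In the **Add App Capability** menu, add the **Bot** capability to the app.
### 3. Configure app permissions
Click **Permission Management**, paste the permission list below, select all, and enable them in bulk: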
```
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
```
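## 2. Project Configuration
Get the `App ID` and `App Secret` from **Credentials & Basic Info** and add them to `config.json`:
<Tabs>
<Tab title="WebSocket Mode (Recommended)">
No public IP is required. Configure as follows: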
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_event_mode": "websocket"
}
```
Install the dependency: `pip3 install lark-oapi`
</Tab>
<Tab title="Webhook Mode">
A public IP is required. Configure as follows:
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_token": "VERIFICATION_TOKEN",
"feishu_event_mode": "webhook",
"feishu_port": 9891
}
```
</Tab>
</Tabs>
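## 3. Configure Event Subscription
1. After starting the project, go to the Feishu Open Platform, click **Events & Callbacks**, choose the **long connection** mode, and save
2. Click **Add Event**, search for "Receive message", select "Receive message v2.0", and confirm
3. Click **Version Management & Release**, create a version, and apply for production release; the app is usable once approved
After that, search for the bot's name in Feishu to start chatting.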

31
docs/channels/web.mdx Normal file
View File

@@ -0,0 +1,31 @@
---
title: Web
description: Use CowAgent from a web page
---
Web is CowAgent's default channel. The web console starts automatically on launch, and you can chat with the Agent from a browser.
## Configuration
```json
{
"channel_type": "web",
"web_port": 9899
}
```
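| Parameter | Description | Default |
| --- | --- | --- |
| `channel_type` | Set to `web` | `web` |
| `web_port` | Port the web service listens on | `9899` |
## Usage
After starting the project, visit:
- Running locally: `http://localhost:9899/chat`
- Running on a server: `http://<server-ip>:9899/chat`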
<Note>
Make sure the server firewall and security group allow the port.
</Note>

View File

@@ -0,0 +1,56 @@
---
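title: WeChat Official Account
description: Integrate CowAgent into a WeChat Official Account
---
CowAgent supports two kinds of Official Account: personal subscription accounts and enterprise service accounts.
| Type | Requirements | Characteristics |
| --- | --- | --- |
| **Personal subscription account** | Individuals can apply | After a reply is generated, the user must send another message to receive it |
| **Enterprise service account** | Applied for by an enterprise; WeChat verification is required to enable the customer service API | Replies can be pushed to the user proactively once generated |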
<Note>
Official Accounts only support server and Docker deployment, and require the optional dependencies: `pip3 install -r requirements-optional.txt`
</Note>
## 1. Personal Subscription Account
Configure in `config.json`:
```json
{
"channel_type": "wechatmp",
"wechatmp_app_id": "YOUR_APP_ID",
"wechatmp_app_secret": "YOUR_APP_SECRET",
"wechatmp_aes_key": "",
"wechatmp_token": "YOUR_TOKEN",
"wechatmp_port": 80
}
```
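### Configuration Steps
1. Get the parameters from **Settings & Development → Basic Configuration → Server Configuration** on the [WeChat Official Accounts Platform](https://mp.weixin.qq.com/)
2. Enable the developer secret and add the server IP to the allowlist
3. Start the program (it listens on port 80)
4. In the Official Account console, **enable the server configuration**; the URL format is `http://{HOST}/wx`
## 2. Enterprise Service Account
The process is largely the same as for a personal subscription account, with these differences:
1. Apply for an enterprise service account on the platform and complete WeChat verification; confirm that the **customer service API** permission has been granted
2. Set `"channel_type": "wechatmp_service"` in `config.json`
3. Replies can be pushed to the user proactively, even when they take a long time to generate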
```json
{
"channel_type": "wechatmp_service",
"wechatmp_app_id": "YOUR_APP_ID",
"wechatmp_app_secret": "YOUR_APP_SECRET",
"wechatmp_aes_key": "",
"wechatmp_token": "YOUR_TOKEN",
"wechatmp_port": 80
}
```

59
docs/channels/wecom.mdx Normal file
View File

@@ -0,0 +1,59 @@
---
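title: WeCom
description: Integrate CowAgent into a WeCom custom app
---
Connect CowAgent through a WeCom (WeChat Work) custom app; one-on-one chats by members of the organization are supported.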
<Note>
WeCom only supports Docker deployment or Python deployment on a server; running locally is not supported.
</Note>
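## 1. Preparation
Required resources:
1. A server with a public IP
2. A registered WeCom organization (individuals can register too, but cannot get verified)
3. A verified WeCom organization also needs a domain with an ICP filing under the same entity
## 2. Create the WeCom App
1. In the [WeCom admin console](https://work.weixin.qq.com/wework_admin/frame#profile), get the **Corp ID** under **My Company**
2. Switch to **App Management**, click Create App, and note the `AgentId` and `Secret`
3. Click **Set API Receiving** and configure the app's endpoint:
- The URL format is `http://ip:port/wxcomapp` (verified organizations must use their ICP-filed domain)
- Generate a random `Token` and `EncodingAESKey` and save them
## 3. Configure and Run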
```json
{
"channel_type": "wechatcom_app",
"wechatcom_corp_id": "YOUR_CORP_ID",
"wechatcomapp_token": "YOUR_TOKEN",
"wechatcomapp_secret": "YOUR_SECRET",
"wechatcomapp_agent_id": "YOUR_AGENT_ID",
"wechatcomapp_aes_key": "YOUR_AES_KEY",
"wechatcomapp_port": 9898
}
```
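| Parameter | Description |
| --- | --- |
| `wechatcom_corp_id` | Corp ID |
| `wechatcomapp_token` | Token from the API receiving configuration |
| `wechatcomapp_secret` | The app's Secret |
| `wechatcomapp_agent_id` | The app's AgentId |
| `wechatcomapp_aes_key` | EncodingAESKey from the API receiving configuration |
| `wechatcomapp_port` | Listening port, default 9898 |
After starting the program, return to the WeCom admin console, save the **message server configuration**, and add the server IP to the **trusted enterprise IPs**.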
<Warning>
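If configuration fails: 1. make sure the firewall and security group allow the port; 2. check that all parameters match; 3. verified organizations must configure an ICP-filed domain.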
</Warning>
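## 4. Usage
Search for the app's name in WeCom to chat with it directly. To let external WeChat users use it, share the follow-invitation QR code under **My Company → WeChat Plugin**.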

323
docs/docs.json Normal file
View File

@@ -0,0 +1,323 @@
{
"$schema": "https://mintlify.com/docs.json",
"name": "CowAgent",
"description": "CowAgent - AI Super Assistant powered by LLMs, with autonomous task planning, long-term memory, skills system, and multi-channel deployment.",
"theme": "mint",
"appearance": {
"default": "light"
},
"colors": {
"primary": "#35A85B",
"light": "#4ABE6E",
"dark": "#228547"
},
"logo": {
"light": "/images/logo.jpg",
"dark": "/images/logo.jpg"
},
"favicon": "/images/favicon.ico",
"navbar": {
"links": [
{
"label": "官网",
"href": "https://cowagent.ai/"
},
{
"label": "GitHub",
"href": "https://github.com/zhayujie/chatgpt-on-wechat"
}
]
},
"footer": {
"socials": {
"github": "https://github.com/zhayujie/chatgpt-on-wechat"
}
},
"navigation": {
"languages": [
{
"language": "zh",
"default": true,
"tabs": [
{
"tab": "项目介绍",
"groups": [
{
"group": "概览",
"pages": [
"intro/index",
"intro/architecture",
"intro/features"
]
}
]
},
{
"tab": "快速开始",
"groups": [
{
"group": "安装部署",
"pages": [
"guide/quick-start",
"guide/manual-install"
]
}
]
},
{
"tab": "模型",
"groups": [
{
"group": "模型配置",
"pages": [
"models/index",
"models/minimax",
"models/glm",
"models/qwen",
"models/kimi",
"models/doubao",
"models/claude",
"models/gemini",
"models/openai",
"models/deepseek",
"models/linkai"
]
}
]
},
{
"tab": "工具",
"groups": [
{
"group": "工具系统",
"pages": [
"tools/index"
]
},
{
"group": "内置工具",
"pages": [
"tools/read",
"tools/write",
"tools/edit",
"tools/ls",
"tools/bash",
"tools/send",
"tools/memory",
"tools/env-config"
]
},
{
"group": "可选工具",
"pages": [
"tools/web-search",
"tools/scheduler"
]
}
]
},
{
"tab": "技能",
"groups": [
{
"group": "技能系统",
"pages": [
"skills/index",
"skills/skill-creator"
]
},
{
"group": "内置技能",
"pages": [
"skills/image-vision",
"skills/linkai-agent",
"skills/web-fetch"
]
}
]
},
{
"tab": "记忆",
"groups": [
{
"group": "记忆系统",
"pages": [
"memory"
]
}
]
},
{
"tab": "通道",
"groups": [
{
"group": "接入渠道",
"pages": [
"channels/web",
"channels/feishu",
"channels/dingtalk",
"channels/wecom",
"channels/wechatmp"
]
}
]
},
{
"tab": "版本",
"groups": [
{
"group": "发布记录",
"pages": [
"releases/overview",
"releases/v2.0.1",
"releases/v2.0.0"
]
}
]
}
]
},
{
"language": "en",
"tabs": [
{
"tab": "Introduction",
"groups": [
{
"group": "Overview",
"pages": [
"en/intro/index",
"en/intro/architecture",
"en/intro/features"
]
}
]
},
{
"tab": "Get Started",
"groups": [
{
"group": "Installation",
"pages": [
"en/guide/quick-start",
"en/guide/manual-install"
]
}
]
},
{
"tab": "Models",
"groups": [
{
"group": "Model Configuration",
"pages": [
"en/models/index",
"en/models/minimax",
"en/models/glm",
"en/models/qwen",
"en/models/kimi",
"en/models/doubao",
"en/models/claude",
"en/models/gemini",
"en/models/openai",
"en/models/deepseek",
"en/models/linkai"
]
}
]
},
{
"tab": "Tools",
"groups": [
{
"group": "Tools System",
"pages": [
"en/tools/index"
]
},
{
"group": "Built-in Tools",
"pages": [
"en/tools/read",
"en/tools/write",
"en/tools/edit",
"en/tools/ls",
"en/tools/bash",
"en/tools/send",
"en/tools/memory",
"en/tools/env-config"
]
},
{
"group": "Optional Tools",
"pages": [
"en/tools/web-search",
"en/tools/scheduler"
]
}
]
},
{
"tab": "Skills",
"groups": [
{
"group": "Skills System",
"pages": [
"en/skills/index",
"en/skills/skill-creator"
]
},
{
"group": "Built-in Skills",
"pages": [
"en/skills/image-vision",
"en/skills/linkai-agent",
"en/skills/web-fetch"
]
}
]
},
{
"tab": "Memory",
"groups": [
{
"group": "Memory System",
"pages": [
"en/memory"
]
}
]
},
{
"tab": "Channels",
"groups": [
{
"group": "Platforms",
"pages": [
"en/channels/web",
"en/channels/feishu",
"en/channels/dingtalk",
"en/channels/wecom",
"en/channels/wechatmp"
]
}
]
},
{
"tab": "Releases",
"groups": [
{
"group": "Release Notes",
"pages": [
"en/releases/overview",
"en/releases/v2.0.1",
"en/releases/v2.0.0"
]
}
]
}
]
}
]
}
}


@@ -0,0 +1,38 @@
---
title: DingTalk
description: Integrate CowAgent into DingTalk application
---
Integrate CowAgent into DingTalk by creating an intelligent robot app on the DingTalk Open Platform.
## 1. Create App
1. Go to the [DingTalk Developer Console](https://open-dev.dingtalk.com/fe/app#/corp/app), click **Create App**, and fill in the app information
2. Click **Add App Capability**, select the **Robot** capability, and add it
3. Configure the robot information and click **Publish**
## 2. Project Configuration
1. Get `Client ID` and `Client Secret` from **Credentials & Basic Info**
2. Fill in `config.json`:
```json
{
"channel_type": "dingtalk",
"dingtalk_client_id": "YOUR_CLIENT_ID",
"dingtalk_client_secret": "YOUR_CLIENT_SECRET"
}
```
3. Install dependency:
```bash
pip3 install dingtalk_stream
```
4. After starting the project, go to DingTalk Developer Console **Event Subscription**, click **Connection verified, verify channel**. When "Connection successful" is displayed, configuration is complete
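
Before starting the project, it can help to confirm that the placeholders from the template were actually replaced. The following is a minimal sketch, not part of CowAgent: the helper name `dingtalk_config_problems` is hypothetical, and only the keys shown in the `config.json` example above are checked.

```python
import json

# Keys taken from the DingTalk config.json example in this page.
REQUIRED_KEYS = ("dingtalk_client_id", "dingtalk_client_secret")


def dingtalk_config_problems(config: dict) -> list:
    """Return readable problems; an empty list means the config looks usable."""
    problems = []
    if config.get("channel_type") != "dingtalk":
        problems.append('channel_type is not "dingtalk"')
    for key in REQUIRED_KEYS:
        value = str(config.get(key, "")).strip()
        if not value:
            problems.append(f"{key} is missing or empty")
        elif value.startswith("YOUR_"):
            # The template value was copied without being replaced.
            problems.append(f"{key} still holds the template placeholder")
    return problems


sample = json.loads(
    '{"channel_type": "dingtalk",'
    ' "dingtalk_client_id": "YOUR_CLIENT_ID",'
    ' "dingtalk_client_secret": "abc"}'
)
print(dingtalk_config_problems(sample))
```

An empty result only means the file is well-formed; the credentials themselves are verified when the DingTalk console reports a successful connection.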
## 3. Usage
Chat privately with the robot or add it to an enterprise group to start a conversation.


@@ -0,0 +1,67 @@
---
title: Feishu (Lark)
description: Integrate CowAgent into Feishu application
---
Integrate CowAgent into Feishu by creating a custom app. Supports WebSocket (recommended, no public IP required) and Webhook event receiving modes.
## 1. Create Enterprise Custom App
### 1.1 Create App
Go to [Feishu Developer Platform](https://open.feishu.cn/app/), click **Create Enterprise Custom App**, fill in the required information and create.
### 1.2 Add Bot Capability
In **Add App Capabilities**, add **Bot** capability to the app.
### 1.3 Configure App Permissions
Click **Permission Management**, paste the following permission string, select all and enable in batch:
```
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
```
## 2. Project Configuration
Get `App ID` and `App Secret` from **Credentials & Basic Info**, then fill in `config.json`:
<Tabs>
<Tab title="WebSocket Mode (Recommended)">
No public IP required. Configuration:
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_event_mode": "websocket"
}
```
Install dependency: `pip3 install lark-oapi`
</Tab>
<Tab title="Webhook Mode">
Requires public IP. Configuration:
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_token": "VERIFICATION_TOKEN",
"feishu_event_mode": "webhook",
"feishu_port": 9891
}
```
</Tab>
</Tabs>
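
The two modes need different keys, which is easy to get wrong when switching between them. A small sketch of that difference, assuming only the keys shown in the two examples above (the helper `missing_feishu_keys` is hypothetical and not part of the project):

```python
# Keys required in both modes, per the examples above.
COMMON_KEYS = ("feishu_app_id", "feishu_app_secret")
# Extra keys that appear only in the Webhook example.
WEBHOOK_ONLY_KEYS = ("feishu_token", "feishu_port")


def missing_feishu_keys(config: dict) -> list:
    """List the keys that the selected event mode needs but the config lacks."""
    mode = config.get("feishu_event_mode")
    if mode not in ("websocket", "webhook"):
        raise ValueError('feishu_event_mode must be "websocket" or "webhook"')
    required = COMMON_KEYS
    if mode == "webhook":
        required = COMMON_KEYS + WEBHOOK_ONLY_KEYS
    return [key for key in required if config.get(key) in (None, "")]


print(missing_feishu_keys({
    "feishu_event_mode": "webhook",
    "feishu_app_id": "YOUR_APP_ID",
    "feishu_app_secret": "YOUR_APP_SECRET",
}))
```

The sketch asks for `feishu_event_mode` explicitly rather than guessing a default, since the default value is not stated here.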
## 3. Configure Event Subscription
1. After starting the project, go to Feishu Developer Platform **Events & Callbacks**, select **Long Connection** and save
2. Click **Add Event**, search for "Receive Message", select "Receive Message v2.0", confirm and add
3. Click **Version Management & Release**, create a version and apply for production release. After approval, you can use it
Search for the bot name in Feishu to start chatting.

docs/en/channels/web.mdx

@@ -0,0 +1,31 @@
---
title: Web
description: Use CowAgent through the web interface
---
Web is CowAgent's default channel. The web console starts automatically after launch, allowing you to chat with the Agent through a browser.
## Configuration
```json
{
"channel_type": "web",
"web_port": 9899
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `channel_type` | Set to `web` | `web` |
| `web_port` | Web service listen port | `9899` |
## Usage
After starting the project, visit:
- Local: `http://localhost:9899/chat`
- Server: `http://<server-ip>:9899/chat`
<Note>
Ensure the server firewall and security group allow the corresponding port.
</Note>
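
If the page does not load, a quick way to tell a firewall problem from a stopped service is to test the port directly. This is a rough sketch using only the standard library; `chat_url` and `port_is_open` are hypothetical helper names, and the `/chat` path and default port come from this page.

```python
import socket


def chat_url(host: str, config: dict) -> str:
    """Build the console URL from config.json values (web_port defaults to 9899)."""
    port = int(config.get("web_port", 9899))
    return f"http://{host}:{port}/chat"


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


print(chat_url("localhost", {"channel_type": "web", "web_port": 9899}))
```

Running `port_is_open` from the server itself and then from your own machine shows whether the port is blocked somewhere in between.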


@@ -0,0 +1,54 @@
---
title: WeChat Official Account
description: Integrate CowAgent with WeChat Official Accounts
---
CowAgent supports both personal subscription accounts and enterprise service accounts.
| Type | Requirements | Features |
| --- | --- | --- |
| **Personal Subscription** | Available to individuals | Replies are not pushed; after a reply is generated, the user must send another message to retrieve it |
| **Enterprise Service** | Enterprise account with verified Customer Service API permission | Can proactively push replies to users |
<Note>
Official Accounts only support server and Docker deployment. Install extended dependencies: `pip3 install -r requirements-optional.txt`
</Note>
## Personal Subscription Account
```json
{
"channel_type": "wechatmp",
"wechatmp_app_id": "YOUR_APP_ID",
"wechatmp_app_secret": "YOUR_APP_SECRET",
"wechatmp_aes_key": "",
"wechatmp_token": "YOUR_TOKEN",
"wechatmp_port": 80
}
```
### Setup Steps
1. Get parameters from [WeChat Official Account Platform](https://mp.weixin.qq.com/) under **Settings & Development → Basic Configuration → Server Configuration**
2. Enable developer secret and add server IP to the whitelist
3. Start the program (listens on port 80)
4. Enable server configuration with URL format `http://{HOST}/wx`
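
When the server configuration is enabled, the WeChat platform sends a verification request to the URL and expects the `wechatmp_token` to match. The sketch below shows the signature check as described in WeChat's public developer documentation for plaintext mode (an empty `wechatmp_aes_key`). It is an illustration only, not CowAgent's own handler, and the function names are hypothetical.

```python
import hashlib
import hmac


def expected_signature(token: str, timestamp: str, nonce: str) -> str:
    """SHA-1 over the three values sorted lexicographically and concatenated."""
    joined = "".join(sorted([token, timestamp, nonce]))
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


def is_valid_request(token: str, signature: str, timestamp: str, nonce: str) -> bool:
    """Compare in constant time against the signature query parameter."""
    return hmac.compare_digest(expected_signature(token, timestamp, nonce), signature)


signature = expected_signature("YOUR_TOKEN", "1700000000", "nonce123")
print(is_valid_request("YOUR_TOKEN", signature, "1700000000", "nonce123"))
```

If verification fails in the platform console, a token mismatch between `config.json` and the platform setting is the first thing this check would expose.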
## Enterprise Service Account
Same setup with these differences:
1. Register an enterprise service account with verified **Customer Service API** permission
2. Set `"channel_type": "wechatmp_service"` in `config.json`
3. Replies can be proactively pushed to users
```json
{
"channel_type": "wechatmp_service",
"wechatmp_app_id": "YOUR_APP_ID",
"wechatmp_app_secret": "YOUR_APP_SECRET",
"wechatmp_aes_key": "",
"wechatmp_token": "YOUR_TOKEN",
"wechatmp_port": 80
}
```


@@ -0,0 +1,59 @@
---
title: WeCom
description: Integrate CowAgent into WeCom enterprise app
---
Integrate CowAgent into WeCom through a custom enterprise app, supporting one-on-one chat for internal employees.
<Note>
WeCom only supports Docker deployment or server Python deployment. Local run mode is not supported.
</Note>
## 1. Prerequisites
Required resources:
1. A server with public IP
2. A registered WeCom account (individuals can register, but an individually registered account cannot be certified)
3. For a certified WeCom account, a domain whose filing matches the certified entity
## 2. Create WeCom App
1. Get **Corp ID** from **My Enterprise** in [WeCom Admin Console](https://work.weixin.qq.com/wework_admin/frame#profile)
2. Switch to **Application Management**, click **Create Application**, and record `AgentId` and `Secret`
3. Click **Set API Reception**, configure application interface:
- URL format: `http://ip:port/wxcomapp` (certified enterprises must use their filed domain instead of an IP)
- Generate random `Token` and `EncodingAESKey` and save
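
The admin console can generate both values, but they can also be produced locally. The sketch below assumes the commonly documented formats, a `Token` of up to 32 alphanumeric characters and an `EncodingAESKey` of exactly 43 alphanumeric characters; confirm these lengths against the console before relying on them. The function names are hypothetical.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def random_string(length: int) -> str:
    """Cryptographically random alphanumeric string of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_callback_credentials() -> dict:
    """Produce values for the two callback-related keys in config.json."""
    return {
        "wechatcomapp_token": random_string(32),    # assumed maximum length
        "wechatcomapp_aes_key": random_string(43),  # assumed fixed length
    }


print(generate_callback_credentials())
```

Whichever way the values are generated, the same pair must be entered both in the console and in `config.json`.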
## 3. Configuration and Run
```json
{
"channel_type": "wechatcom_app",
"wechatcom_corp_id": "YOUR_CORP_ID",
"wechatcomapp_token": "YOUR_TOKEN",
"wechatcomapp_secret": "YOUR_SECRET",
"wechatcomapp_agent_id": "YOUR_AGENT_ID",
"wechatcomapp_aes_key": "YOUR_AES_KEY",
"wechatcomapp_port": 9898
}
```
| Parameter | Description |
| --- | --- |
| `wechatcom_corp_id` | Corp ID |
| `wechatcomapp_token` | Token from API reception config |
| `wechatcomapp_secret` | App Secret |
| `wechatcomapp_agent_id` | App AgentId |
| `wechatcomapp_aes_key` | EncodingAESKey from API reception config |
| `wechatcomapp_port` | Listen port, default 9898 |
After starting the program, return to WeCom Admin Console to save **Message Server Configuration**, and add the server IP to **Enterprise Trusted IPs**.
<Warning>
If configuration fails: 1. Ensure firewall and security group allow the port; 2. Verify all parameters are consistent; 3. Certified enterprises must configure a filed domain.
</Warning>
## 4. Usage
Search for the app name in WeCom to start chatting. To allow external WeChat users, share the invite QR code from **My Enterprise → WeChat Plugin**.


@@ -0,0 +1,113 @@
---
title: Manual Install
description: Deploy CowAgent manually (source code / Docker)
---
## Source Code Deployment
### 1. Clone the project
```bash
git clone https://github.com/zhayujie/chatgpt-on-wechat
cd chatgpt-on-wechat/
```
<Tip>
For network issues, use the mirror: https://gitee.com/zhayujie/chatgpt-on-wechat
</Tip>
### 2. Install dependencies
Core dependencies (required):
```bash
pip3 install -r requirements.txt
```
Optional dependencies (recommended):
```bash
pip3 install -r requirements-optional.txt
```
### 3. Configure
Copy the config template and edit:
```bash
cp config-template.json config.json
```
Fill in model API keys, channel type, and other settings in `config.json`. See the [model docs](/en/models/index) for details.
### 4. Run
**Local run:**
```bash
python3 app.py
```
By default, the Web service starts. Access `http://localhost:9899/chat` to chat.
**Background run on server:**
```bash
nohup python3 app.py & tail -f nohup.out
```
## Docker Deployment
Docker deployment does not require cloning source code or installing dependencies. For Agent mode, source deployment is recommended for broader system access.
<Note>
Requires [Docker](https://docs.docker.com/engine/install/) and docker-compose.
</Note>
**1. Download config**
```bash
wget https://cdn.link-ai.tech/code/cow/docker-compose.yml
```
Edit `docker-compose.yml` with your configuration.
**2. Start container**
```bash
sudo docker compose up -d
```
**3. View logs**
```bash
sudo docker logs -f chatgpt-on-wechat
```
## Core Configuration
```json
{
"channel_type": "web",
"model": "MiniMax-M2.5",
"agent": true,
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30,
"agent_max_steps": 15
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `channel_type` | Channel type | `web` |
| `model` | Model name | `MiniMax-M2.5` |
| `agent` | Enable Agent mode | `true` |
| `agent_workspace` | Agent workspace path | `~/cow` |
| `agent_max_context_tokens` | Max context tokens | `40000` |
| `agent_max_context_turns` | Max context turns | `30` |
| `agent_max_steps` | Max decision steps per task | `15` |
<Tip>
Full configuration options are in the project [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py).
</Tip>
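
To see which values are in effect when `config.json` sets only some keys, the defaults from the table can be overlaid with the file contents. This is a minimal sketch that mirrors the table, not the logic in `config.py`; `merge_config` and `load_config` are hypothetical names.

```python
import json
from pathlib import Path

# Defaults copied from the Core Configuration table.
DEFAULTS = {
    "channel_type": "web",
    "model": "MiniMax-M2.5",
    "agent": True,
    "agent_workspace": "~/cow",
    "agent_max_context_tokens": 40000,
    "agent_max_context_turns": 30,
    "agent_max_steps": 15,
}


def merge_config(user_config: dict) -> dict:
    """Overlay user-provided values on top of the documented defaults."""
    merged = dict(DEFAULTS)
    merged.update(user_config)
    return merged


def load_config(path: str = "config.json") -> dict:
    """Read config.json if it exists and fall back to defaults otherwise."""
    file = Path(path)
    user_config = json.loads(file.read_text(encoding="utf-8")) if file.exists() else {}
    return merge_config(user_config)


print(merge_config({"model": "glm-5", "agent_max_steps": 20}))
```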


@@ -0,0 +1,39 @@
---
title: One-click Install
description: One-click install and manage CowAgent with scripts
---
The project provides scripts for one-click install, configuration, startup, and management. Script-based deployment is recommended for quick setup.
Supports Linux, macOS, and Windows. Requires Python 3.7-3.12 (3.9 recommended).
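
The supported range can be checked ahead of time with a few lines of Python. This sketch only encodes the 3.7 to 3.12 range stated above; the helper `is_supported` is a hypothetical name and is not part of the install script.

```python
import sys

MIN_VERSION = (3, 7)
MAX_VERSION = (3, 12)


def is_supported(version=None) -> bool:
    """Check a (major, minor, ...) tuple, or the running interpreter by default."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


current = ".".join(str(part) for part in sys.version_info[:3])
print(f"Python {current} supported: {is_supported()}")
```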
## Install Command
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
The script automatically performs these steps:
1. Check Python environment (requires Python 3.7+)
2. Install required tools (git, curl, etc.)
3. Clone project to `~/chatgpt-on-wechat`
4. Install Python dependencies
5. Guided configuration for AI model and channel
6. Start service
By default, the Web service starts after installation. Access `http://localhost:9899/chat` to begin chatting.
## Management Commands
After installation, use these commands to manage the service:
| Command | Description |
| --- | --- |
| `./run.sh start` | Start service |
| `./run.sh stop` | Stop service |
| `./run.sh restart` | Restart service |
| `./run.sh status` | Check run status |
| `./run.sh logs` | View real-time logs |
| `./run.sh config` | Reconfigure |
| `./run.sh update` | Update project code |


@@ -0,0 +1,71 @@
---
title: Architecture
description: CowAgent 2.0 system architecture and core design
---
CowAgent 2.0 has evolved from a simple chatbot into a super intelligent assistant with Agent architecture, featuring autonomous thinking, task planning, long-term memory, and skill extensibility.
## System Architecture
CowAgent's architecture consists of the following core modules:
<img src="https://cdn.link-ai.tech/doc/68ef7b212c6f791e0e74314b912149f9-sz_5847990.png" alt="CowAgent Architecture" />
### Core Modules
| Module | Description |
| --- | --- |
| **Channels** | Message channel layer for receiving and sending messages. Supports Web, Feishu, DingTalk, WeCom, WeChat Official Account, and more |
| **Agent Core** | Agent engine including task planning, memory system, and skills engine |
| **Tools** | Tool layer for Agent to access OS resources. 10+ built-in tools |
| **Models** | Model layer with unified access to mainstream LLMs |
## Agent Mode Workflow
When Agent mode is enabled, CowAgent runs as an autonomous agent with the following workflow:
1. **Receive Message** — Receive user input through channels
2. **Understand Intent** — Analyze task requirements and context
3. **Plan Task** — Break complex tasks into multiple steps
4. **Invoke Tools** — Select and execute appropriate tools for each step
5. **Update Memory** — Store important information in long-term memory
6. **Return Result** — Send execution results back to the user
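
The workflow above can be pictured as a bounded loop. The sketch below is a deliberately simplified toy, not CowAgent's engine: the real Agent lets the model choose each next step, whereas here a hypothetical `plan` callable returns a fixed list of steps. All names are illustrative; only the `max_steps` limit corresponds to the `agent_max_steps` setting.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (tool name, argument)


def run_agent(
    message: str,
    plan: Callable[[str], List[Step]],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 15,
) -> dict:
    """Plan steps for a message, run each tool, and record what happened."""
    memory: List[str] = []
    results: List[str] = []
    for index, (tool_name, argument) in enumerate(plan(message)):
        if index >= max_steps:
            results.append("stopped: step limit reached")
            break
        tool = tools.get(tool_name)
        output = tool(argument) if tool else f"unknown tool: {tool_name}"
        results.append(output)
        memory.append(f"{tool_name}({argument}) -> {output}")
    return {"reply": "\n".join(results), "memory": memory}


demo_tools = {"upper": str.upper, "reverse": lambda text: text[::-1]}
demo_plan = lambda message: [("upper", message), ("reverse", message)]
print(run_agent("hello", demo_plan, demo_tools))
```

The step limit is what keeps a task from running indefinitely when a plan turns out longer than expected.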
## Workspace Directory Structure
The Agent workspace is located at `~/cow` by default and stores system prompts, memory files, and skill files:
```
~/cow/
├── system.md # Agent system prompt
├── user.md # User profile
├── memory/ # Long-term memory storage
│ ├── core.md # Core memory
│ └── daily/ # Daily memory
├── skills/ # Custom skills
│ ├── skill-1/
│ └── skill-2/
└── .env # Secret keys for skills
```
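
To confirm that a workspace has been initialized, the layout above can be checked entry by entry. This is a small sketch based only on the tree shown here; `missing_entries` is a hypothetical helper, and the Agent itself is responsible for creating these files.

```python
from pathlib import Path

# Entries taken from the directory tree above (example skill folders omitted).
WORKSPACE_ENTRIES = (
    "system.md",
    "user.md",
    "memory/core.md",
    "memory/daily",
    "skills",
    ".env",
)


def missing_entries(workspace: str = "~/cow") -> list:
    """Return the expected files and directories that do not exist yet."""
    root = Path(workspace).expanduser()
    return [entry for entry in WORKSPACE_ENTRIES if not (root / entry).exists()]


print(missing_entries("/nonexistent-cow-workspace"))
```

Some entries may legitimately be absent on a fresh install, for example `.env` before any skill secret has been configured.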
## Core Configuration
Configure Agent mode parameters in `config.json`:
```json
{
"agent": true,
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30,
"agent_max_steps": 15
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `agent` | Enable Agent mode | `true` |
| `agent_workspace` | Workspace path | `~/cow` |
| `agent_max_context_tokens` | Max context tokens | `40000` |
| `agent_max_context_turns` | Max context turns | `30` |
| `agent_max_steps` | Max decision steps per task | `15` |

docs/en/intro/features.mdx

@@ -0,0 +1,105 @@
---
title: Features
description: CowAgent long-term memory, task planning, and skills system in detail
---
## 1. Long-term Memory
The memory system enables the Agent to remember important information over time. The Agent proactively stores information when users share preferences, decisions, or key facts, and automatically extracts summaries when conversations reach a certain length. Memory is divided into core memory and daily memory, with hybrid retrieval supporting both keyword search and vector search.
On first launch, the Agent proactively asks the user for key information and records it in the workspace (default `~/cow`) — including agent settings, user identity, and memory files.
In subsequent long-term conversations, the Agent intelligently stores or retrieves memory as needed, continuously updating its own settings, user preferences, and memory files, summarizing experiences and lessons learned — truly achieving autonomous thinking and continuous growth.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
## 2. Task Planning and Tool Use
Tools are the core of how the Agent accesses operating system resources. The Agent intelligently selects and invokes tools based on task requirements, performing file read/write, command execution, scheduled tasks, and more. Built-in tools are implemented in the project's `agent/tools/` directory.
**Key tools:** file read/write/edit, Bash terminal, file send, scheduler, memory search, web search, environment config, and more.
### 2.1 Terminal and File Access
Access to the OS terminal and file system is the Agent's most fundamental capability, and many other tools and skills build on it. Users can interact with the Agent from a mobile device to operate resources on their personal computer or server:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202181130.png" width="800" />
</Frame>
### 2.2 Programming Capability
Combining programming and system access, the Agent can execute a complete **vibe coding workflow**: information search, asset generation, coding, testing, deployment, Nginx configuration, and publishing, all triggered by a single command from your phone:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>
### 2.3 Scheduled Tasks
The `scheduler` tool enables dynamic scheduled tasks, supporting **one-time tasks, fixed intervals, and Cron expressions**. Tasks can be triggered as either a **fixed message send** or an **Agent dynamic task** execution:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>
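
The difference between the schedule types can be made concrete with a next-run calculation. The sketch below covers only one-time and fixed-interval schedules; Cron expressions are left out because they need a dedicated parser. The `Schedule` class and `next_run` function are hypothetical and do not reflect the `scheduler` tool's internal data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Schedule:
    kind: str                           # "once" or "interval"
    run_at: Optional[datetime] = None   # used by "once"
    every: Optional[timedelta] = None   # used by "interval"


def next_run(schedule: Schedule, last_run: Optional[datetime], now: datetime) -> Optional[datetime]:
    """Return when the task should fire next, or None if it is finished."""
    if schedule.kind == "once":
        # A one-time task fires at run_at and never again.
        return schedule.run_at if last_run is None else None
    if schedule.kind == "interval":
        # An interval task fires immediately, then every `every` after that.
        return now if last_run is None else last_run + schedule.every
    raise ValueError(f"unsupported schedule kind: {schedule.kind}")


start = datetime(2026, 1, 1, 9, 0)
hourly = Schedule("interval", every=timedelta(hours=1))
print(next_run(hourly, start, start))
```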
### 2.4 Environment Variable Management
Secrets required by skills are stored in an environment variable file, managed by the `env_config` tool. You can update secrets through conversation, with built-in security protection and masking of sensitive values:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234939.png" width="800" />
</Frame>
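
Hiding secret values when they are displayed typically means showing only a few leading and trailing characters. The following is a generic sketch of that idea, not the `env_config` tool's actual behavior; `mask_secret` and its `visible` parameter are hypothetical.

```python
def mask_secret(value: str, visible: int = 4) -> str:
    """Keep the first and last `visible` characters and star out the rest."""
    if len(value) <= visible * 2:
        # Too short to reveal anything safely: hide the whole value.
        return "*" * len(value)
    hidden = len(value) - visible * 2
    return value[:visible] + "*" * hidden + value[-visible:]


print(mask_secret("sk-1234567890abcdef"))
```

Very short values are hidden completely, since showing both ends would reveal most of the secret.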
## 3. Skills System
The Skills system provides infinite extensibility for the Agent. Each Skill consists of a description file, execution scripts (optional), and resources (optional), describing how to complete specific types of tasks. Skills allow the Agent to follow instructions for complex workflows, invoke tools, or integrate third-party systems.
- **Built-in skills:** Located in the project's `skills/` directory, including skill creator, image recognition, LinkAI agent, web fetch, and more. Built-in skills are automatically enabled based on dependency conditions (API keys, system commands, etc.).
- **Custom skills:** Created by users through conversation, stored in the workspace (`~/cow/skills/`), capable of implementing any complex business process or third-party integration.
### 3.1 Creating Skills
The `skill-creator` skill enables rapid skill creation through conversation. You can ask the Agent to codify a workflow as a skill, or send any API documentation and examples for the Agent to complete the integration directly:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
### 3.2 Web Search and Image Recognition
- **Web search:** Built-in `web_search` tool, supports multiple search engines. Configure `BOCHA_API_KEY` or `LINKAI_API_KEY` to enable.
- **Image recognition:** Built-in `openai-image-vision` skill, supports `gpt-4.1-mini`, `gpt-4.1`, and other models. Requires `OPENAI_API_KEY`.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>
### 3.3 Third-party Knowledge Bases and Plugins
The `linkai-agent` skill makes all agents on [LinkAI](https://link-ai.tech/) available as Skills for the Agent, enabling multi-agent decision making.
Configuration: set `LINKAI_API_KEY` via `env_config`, then add agent descriptions in `skills/linkai-agent/config.json`:
```json
{
"apps": [
{
"app_code": "G7z6vKwp",
"app_name": "LinkAI Customer Support",
"app_description": "Select only when the user needs help with LinkAI platform questions"
},
{
"app_code": "SFY5x7JR",
"app_name": "Content Creator",
"app_description": "Use only when the user needs to create images or videos"
}
]
}
```
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>

docs/en/intro/index.mdx

@@ -0,0 +1,68 @@
---
title: Introduction
description: CowAgent - AI Super Assistant powered by LLMs
---
<img src="https://cdn.link-ai.tech/doc/78c5dd674e2c828642ecc0406669fed7.png" alt="CowAgent" width="600px"/>
**CowAgent** is an AI super assistant powered by LLMs with autonomous task planning, long-term memory, skills system, multimodal messages, multiple model support, and multi-platform deployment.
CowAgent can proactively think and plan tasks, operate computers and external resources, create and execute Skills, and continuously grow with long-term memory. It supports flexible switching between multiple models, handles text, voice, images, files, and other multimodal messages, and can be integrated into the web, Feishu, DingTalk, WeCom, and WeChat Official Accounts. It can run around the clock on your personal computer or server.
<Card title="GitHub" icon="github" href="https://github.com/zhayujie/chatgpt-on-wechat">
github.com/zhayujie/chatgpt-on-wechat
</Card>
## Core Capabilities
<CardGroup cols={2}>
<Card title="Autonomous Task Planning" icon="brain" href="/en/intro/architecture">
Understands complex tasks and autonomously plans execution, continuously thinking and invoking tools until goals are achieved. Supports accessing file systems, terminals, browsers, schedulers, and other system resources through tools.
</Card>
<Card title="Long-term Memory" icon="database" href="/en/memory">
Automatically persists conversation memory to local files and databases, including core memory and daily memory, with keyword and vector retrieval support.
</Card>
<Card title="Skills System" icon="puzzle-piece" href="/en/skills/index">
Implements a Skills creation and execution engine with built-in skills, and supports custom Skills development through natural language conversation.
</Card>
<Card title="Multimodal Messages" icon="image" href="/en/channels/web">
Supports parsing, processing, generating, and sending text, images, voice, files, and other message types.
</Card>
<Card title="Multiple Model Support" icon="microchip" href="/en/models/index">
Supports mainstream model providers including OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, Doubao, and more.
</Card>
<Card title="Multi-platform Deployment" icon="server" href="/en/channels/web">
Runs on local computers or servers, integrable into web, Feishu, DingTalk, WeChat Official Account, and WeCom applications.
</Card>
</CardGroup>
## Quick Experience
Run the following command in your terminal for one-click install, configuration, and startup:
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
By default, the Web service starts after running. Access `http://localhost:9899/chat` to chat in the web interface.
<CardGroup cols={2}>
<Card title="Quick Start" icon="rocket" href="/en/guide/quick-start">
Complete installation and run guide
</Card>
<Card title="Architecture" icon="sitemap" href="/en/intro/architecture">
CowAgent system architecture design
</Card>
</CardGroup>
## Disclaimer
1. This project follows the [MIT License](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/LICENSE) and is intended for technical research and learning. Users must comply with local laws, regulations, policies, and corporate bylaws. Any illegal or rights-infringing use is prohibited.
2. Agent mode consumes more tokens than normal chat mode. Choose models based on effectiveness and cost. Agent has access to the host operating system — deploy with caution.
3. CowAgent focuses on open-source development and does not participate in, authorize, or issue any cryptocurrency.
## Community
Add our assistant on WeChat to join the open-source community:
<img width="140" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png" />

docs/en/memory.mdx

@@ -0,0 +1,64 @@
---
title: Memory
description: CowAgent long-term memory system
---
The memory system enables the Agent to remember important information over time, continuously accumulating experience, understanding user preferences, and truly achieving autonomous thinking and continuous growth.
## How It Works
The Agent proactively stores or retrieves memory in the following scenarios:
- **When user shares important information** — Automatically identifies and stores preferences, decisions, facts, and other key information
- **When conversation reaches a certain length** — Automatically extracts summaries to prevent information loss
- **When retrieval is needed** — Intelligently searches historical memory, combining context for responses
## Memory Types
### Core Memory
Stored in `~/cow/memory/core.md`, containing long-term user preferences, important decisions, key facts, and other information that doesn't fade over time.
### Daily Memory
Stored in `~/cow/memory/daily/` directory, organized by date, recording daily conversation summaries and key events.
## First Launch
On first launch, the Agent will proactively ask the user for key information and save it to the workspace (default `~/cow`):
| File | Description |
| --- | --- |
| `system.md` | Agent system prompt and behavior settings |
| `user.md` | User identity information and preferences |
| `memory/core.md` | Core memory |
| `memory/daily/` | Daily memory directory |
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
## Memory Retrieval
The memory system supports hybrid retrieval modes:
- **Keyword retrieval** — Match historical memory based on keywords
- **Vector retrieval** — Semantic similarity search, finds relevant memory even with different wording
The Agent automatically triggers memory retrieval during conversation as needed, incorporating relevant historical information into context.
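
Hybrid retrieval usually means scoring each memory entry twice and blending the results. The toy sketch below illustrates that blending only. It is not CowAgent's implementation: a bag-of-words cosine stands in for real embedding vectors, so unlike true vector retrieval it cannot match differently worded text. All function names and the `keyword_weight` parameter are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    return re.findall(r"\w+", text.lower())


def keyword_score(query: str, doc: str) -> float:
    """Fraction of query words that appear in the document."""
    query_words, doc_words = set(tokenize(query)), set(tokenize(doc))
    return len(query_words & doc_words) / len(query_words) if query_words else 0.0


def cosine_score(query: str, doc: str) -> float:
    """Cosine similarity of word-count vectors (stand-in for embeddings)."""
    a, b = Counter(tokenize(query)), Counter(tokenize(doc))
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, memories: list, keyword_weight: float = 0.5, top_k: int = 3) -> list:
    """Rank memories by a weighted blend of both scores, dropping zero matches."""
    scored = [
        (keyword_weight * keyword_score(query, m) + (1 - keyword_weight) * cosine_score(query, m), m)
        for m in memories
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [memory for score, memory in scored[:top_k] if score > 0]


notes = ["User prefers dark mode", "Deploy server every Friday", "Favorite language is Python"]
print(hybrid_search("python language", notes))
```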
## Configuration
```json
{
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `agent_workspace` | Workspace path, memory files stored under this directory | `~/cow` |
| `agent_max_context_tokens` | Max context tokens, affects short-term memory capacity | `40000` |
| `agent_max_context_turns` | Max context turns, oldest conversations discarded when exceeded | `30` |
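
The two limits work together: the turn limit caps how many exchanges are kept, and the token limit caps their total size. The sketch below shows one plausible way to apply both, dropping the oldest turns first. It is an illustration under stated assumptions, not the project's context manager: `trim_history` is a hypothetical name and the default token estimate (about four characters per token) is a crude guess.

```python
def trim_history(turns: list, max_turns: int = 30, max_tokens: int = 40000, count_tokens=None) -> list:
    """Keep the newest turns that fit within both the turn and token limits."""
    count = count_tokens or (lambda text: len(text) // 4 + 1)  # rough estimate
    kept = list(turns[-max_turns:]) if max_turns > 0 else []
    # Drop from the oldest end until the token budget is met.
    while kept and sum(count(turn) for turn in kept) > max_tokens:
        kept.pop(0)
    return kept


history = [f"turn {number}" for number in range(40)]
print(len(trim_history(history)))
```

Passing a real tokenizer as `count_tokens` gives a more faithful picture for a specific model.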

docs/en/models/claude.mdx

@@ -0,0 +1,17 @@
---
title: Claude
description: Claude model configuration
---
```json
{
"model": "claude-sonnet-4-6",
"claude_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, etc. See [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
| `claude_api_key` | Create at [Claude Console](https://console.anthropic.com/settings/keys) |
| `claude_api_base` | Optional. Defaults to `https://api.anthropic.com/v1`. Change to use third-party proxy |


@@ -0,0 +1,22 @@
---
title: DeepSeek
description: DeepSeek model configuration
---
Use OpenAI-compatible configuration:
```json
{
"model": "deepseek-chat",
"bot_type": "chatGPT",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.deepseek.com/v1"
}
```
| Parameter | Description |
| --- | --- |
| `model` | `deepseek-chat` (DeepSeek-V3), `deepseek-reasoner` (DeepSeek-R1) |
| `bot_type` | Must be `chatGPT` (OpenAI-compatible mode) |
| `open_ai_api_key` | Create at [DeepSeek Platform](https://platform.deepseek.com/api_keys) |
| `open_ai_api_base` | DeepSeek platform BASE URL |
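
To verify the key and base URL independently of CowAgent, the same values can be used to build a standard OpenAI-style chat completion request. The sketch below only constructs the request and does not send it. The `/chat/completions` path follows the usual OpenAI-compatible convention, and `build_chat_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_chat_request(config: dict, user_message: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    url = config["open_ai_api_base"].rstrip("/") + "/chat/completions"
    body = {
        "model": config["model"],
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {config['open_ai_api_key']}",
        },
        method="POST",
    )


request = build_chat_request(
    {
        "model": "deepseek-chat",
        "open_ai_api_key": "YOUR_API_KEY",
        "open_ai_api_base": "https://api.deepseek.com/v1",
    },
    "Hello",
)
print(request.full_url)
```

Passing the result to `urllib.request.urlopen` sends it; an authentication error at that point indicates a key problem rather than a CowAgent configuration issue.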

docs/en/models/doubao.mdx

@@ -0,0 +1,17 @@
---
title: Doubao (ByteDance)
description: Doubao (Volcano Ark) model configuration
---
```json
{
"model": "doubao-seed-2-0-code-preview-260215",
"ark_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `doubao-seed-2-0-code-preview-260215`, `doubao-seed-2-0-pro-260215`, `doubao-seed-2-0-lite-260215`, etc. |
| `ark_api_key` | Create at [Volcano Ark Console](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) |
| `ark_base_url` | Optional. Defaults to `https://ark.cn-beijing.volces.com/api/v3` |

docs/en/models/gemini.mdx

@@ -0,0 +1,16 @@
---
title: Gemini
description: Google Gemini model configuration
---
```json
{
"model": "gemini-3.1-pro-preview",
"gemini_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, `gemini-2.5-pro`, `gemini-2.0-flash`, etc. See [official docs](https://ai.google.dev/gemini-api/docs/models) |
| `gemini_api_key` | Create at [Google AI Studio](https://aistudio.google.com/app/apikey) |

docs/en/models/glm.mdx

@@ -0,0 +1,27 @@
---
title: GLM (Zhipu AI)
description: Zhipu AI GLM model configuration
---
```json
{
"model": "glm-5",
"zhipu_ai_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, etc. See [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
| `zhipu_ai_api_key` | Create at [Zhipu AI Console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "glm-5",
"open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
"open_ai_api_key": "YOUR_API_KEY"
}
```

docs/en/models/index.mdx

@@ -0,0 +1,55 @@
---
title: Models Overview
description: Supported models and recommended choices for CowAgent
---
CowAgent supports mainstream LLMs from domestic and international providers. Model interfaces are implemented in the project's `models/` directory.
<Note>
For Agent mode, the following models are recommended based on quality and cost: MiniMax-M2.5, glm-5, kimi-k2.5, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
</Note>
## Configuration
Configure the model name and API key in `config.json` according to your chosen model. Each model also supports OpenAI-compatible access by setting `bot_type` to `chatGPT` and configuring `open_ai_api_base` and `open_ai_api_key`.
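
The OpenAI-compatible form always has the same four keys, with only the base URL and model changing per provider. A minimal sketch of that pattern is below. The base URLs are the ones listed on the DeepSeek, GLM, and Kimi pages; the provider labels and the helper `openai_compatible_config` are hypothetical.

```python
# Base URLs as documented on the individual provider pages.
OPENAI_COMPATIBLE_BASES = {
    "deepseek": "https://api.deepseek.com/v1",
    "glm": "https://open.bigmodel.cn/api/paas/v4",
    "kimi": "https://api.moonshot.cn/v1",
}


def openai_compatible_config(provider: str, model: str, api_key: str) -> dict:
    """Return the config.json fragment for OpenAI-compatible access."""
    if provider not in OPENAI_COMPATIBLE_BASES:
        raise KeyError(f"no documented base URL for provider: {provider}")
    return {
        "bot_type": "chatGPT",
        "model": model,
        "open_ai_api_base": OPENAI_COMPATIBLE_BASES[provider],
        "open_ai_api_key": api_key,
    }


print(openai_compatible_config("kimi", "kimi-k2.5", "YOUR_API_KEY"))
```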
You can also use the [LinkAI](https://link-ai.tech) platform interface to flexibly switch between multiple models with support for knowledge base, workflows, and other Agent capabilities.
## Supported Models
<CardGroup cols={2}>
<Card title="MiniMax" href="/en/models/minimax">
MiniMax-M2.5 and other series models
</Card>
<Card title="GLM (Zhipu AI)" href="/en/models/glm">
glm-5, glm-4.7 and other series models
</Card>
<Card title="Qwen (Tongyi Qianwen)" href="/en/models/qwen">
qwen3.5-plus, qwen3-max and more
</Card>
<Card title="Kimi" href="/en/models/kimi">
kimi-k2.5, kimi-k2 and more
</Card>
<Card title="Doubao (ByteDance)" href="/en/models/doubao">
doubao-seed series models
</Card>
<Card title="Claude" href="/en/models/claude">
claude-sonnet-4-6 and more
</Card>
<Card title="Gemini" href="/en/models/gemini">
gemini-3.1-pro-preview and more
</Card>
<Card title="OpenAI" href="/en/models/openai">
gpt-4.1, o-series and more
</Card>
<Card title="DeepSeek" href="/en/models/deepseek">
deepseek-chat, deepseek-reasoner
</Card>
<Card title="LinkAI" href="/en/models/linkai">
Unified multi-model interface + knowledge base
</Card>
</CardGroup>
<Tip>
For a full list of model names, refer to the project's [`common/const.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py) file.
</Tip>

27
docs/en/models/kimi.mdx Normal file

@@ -0,0 +1,27 @@
---
title: Kimi (Moonshot)
description: Kimi (Moonshot) model configuration
---
```json
{
"model": "kimi-k2.5",
"moonshot_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
| `moonshot_api_key` | Create at [Moonshot Console](https://platform.moonshot.cn/console/api-keys) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "kimi-k2.5",
"open_ai_api_base": "https://api.moonshot.cn/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```

23
docs/en/models/linkai.mdx Normal file

@@ -0,0 +1,23 @@
---
title: LinkAI
description: Unified access to multiple models via LinkAI platform
---
The [LinkAI](https://link-ai.tech) platform lets you flexibly switch between OpenAI, Claude, Gemini, DeepSeek, Qwen, Kimi, and other models, with support for knowledge base, workflows, plugins, and other Agent capabilities.
```json
{
"use_linkai": true,
"linkai_api_key": "YOUR_API_KEY",
"linkai_app_code": "YOUR_APP_CODE"
}
```
| Parameter | Description |
| --- | --- |
| `use_linkai` | Set to `true` to enable LinkAI interface |
| `linkai_api_key` | Create at [LinkAI Console](https://link-ai.tech/console/interface) |
| `linkai_app_code` | Optional. Code of the LinkAI agent (app or workflow) |
| `model` | Leave empty to use the agent's default model. Can be switched flexibly on the platform. All models in the [model list](https://link-ai.tech/console/models) are supported |
See the [API documentation](https://docs.link-ai.tech/platform/api) for more details.


@@ -0,0 +1,27 @@
---
title: MiniMax
description: MiniMax model configuration
---
```json
{
"model": "MiniMax-M2.5",
"minimax_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `MiniMax-M2.5`, `MiniMax-M2.1`, `MiniMax-M2.1-lightning`, `MiniMax-M2`, etc. |
| `minimax_api_key` | Create at [MiniMax Console](https://platform.minimaxi.com/user-center/basic-information/interface-key) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "MiniMax-M2.5",
"open_ai_api_base": "https://api.minimaxi.com/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```

19
docs/en/models/openai.mdx Normal file

@@ -0,0 +1,19 @@
---
title: OpenAI
description: OpenAI model configuration
---
```json
{
"model": "gpt-4.1-mini",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.openai.com/v1"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Matches the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API. Supports o-series, gpt-5.2, gpt-5.1, gpt-4.1, etc. |
| `open_ai_api_key` | Create at [OpenAI Platform](https://platform.openai.com/api-keys) |
| `open_ai_api_base` | Optional. Change to use third-party proxy |
| `bot_type` | Not required for official OpenAI models. Set to `chatGPT` when using Claude or other non-OpenAI models via proxy |

27
docs/en/models/qwen.mdx Normal file

@@ -0,0 +1,27 @@
---
title: Qwen (Tongyi Qianwen)
description: Tongyi Qianwen model configuration
---
```json
{
"model": "qwen3.5-plus",
"dashscope_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `qwen3.5-plus`, `qwen3-max`, `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwq-plus`, etc. |
| `dashscope_api_key` | Create at [Bailian Console](https://bailian.console.aliyun.com/?tab=model#/api-key). See [official docs](https://bailian.console.aliyun.com/?tab=api#/api) |
OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "chatGPT",
"model": "qwen3.5-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```


@@ -0,0 +1,22 @@
---
title: Changelog
description: CowAgent version history
---
| Version | Date | Description |
| --- | --- | --- |
| [2.0.1](/en/releases/v2.0.1) | 2026.02.27 | Built-in Web Search tool, smart context management, multiple fixes |
| [2.0.0](/en/releases/v2.0.0) | 2026.02.03 | Full upgrade to AI super assistant |
| 1.7.6 | 2025.05.23 | Web Channel optimization, AgentMesh plugin |
| 1.7.5 | 2025.04.11 | DeepSeek model |
| 1.7.4 | 2024.12.13 | Gemini 2.0 model, Web Channel |
| 1.7.3 | 2024.10.31 | Stability improvements, database features |
| 1.7.2 | 2024.09.26 | One-click install script, o1 model |
| 1.7.0 | 2024.08.02 | iFlytek 4.0 model, knowledge base references |
| 1.6.9 | 2024.07.19 | gpt-4o-mini, Alibaba voice recognition |
| 1.6.8 | 2024.07.05 | Claude 3.5, Gemini 1.5 Pro |
| 1.6.0 | 2024.04.26 | Kimi integration, gpt-4-turbo upgrade |
| 1.5.0 | 2023.11.10 | gpt-4-turbo, dall-e-3, tts multimodal |
| 1.0.0 | 2022.12.12 | Project created, first ChatGPT integration |
See [GitHub Releases](https://github.com/zhayujie/chatgpt-on-wechat/releases) for full history.


@@ -0,0 +1,63 @@
---
title: v2.0.0
description: CowAgent 2.0 - Full upgrade from chatbot to AI super assistant
---
CowAgent 2.0 is a comprehensive upgrade from a chatbot to an **AI super assistant** — capable of autonomous thinking and task planning, long-term memory, operating computers, and creating and executing skills.
**Release Date**: 2026.02.03 | [GitHub Release](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.0)
## Key Updates
### Agent Core
- **Complex Task Planning**: Autonomous planning with multi-turn reasoning
- **Long-term Memory**: Persistent memory with keyword and vector search
- **Built-in Tools**: 10+ tools including file ops, Bash, browser, scheduler
- **Web search**: Built-in `web_search` tool with support for multiple search engines; configure the corresponding API key to enable it
- **Skills System**: Skill engine with built-in and custom skill support
- **Security & Cost**: Secret management, prompt controls, token limits
### Other
- **Channels**: Feishu/DingTalk WebSocket support, image/file messages
- **Models**: claude-sonnet-4-5, gemini-3-pro-preview, glm-4.7, MiniMax-M2.1, qwen3-max
- **Deployment**: One-click install, configure, run, and management script
## Long-term Memory
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
## Task Planning & Tools
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202181130.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>
## Skills System
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>
## Contributing
Feedback and contributions are welcome: [submit feedback](https://github.com/zhayujie/chatgpt-on-wechat/issues) or [contribute code](https://github.com/zhayujie/chatgpt-on-wechat/pulls).


@@ -0,0 +1,36 @@
---
title: v2.0.1
description: CowAgent 2.0.1 - Built-in Web Search, smart context management, multiple fixes
---
**Release Date**: 2026.02.27 | [Full Changelog](https://github.com/zhayujie/chatgpt-on-wechat/compare/2.0.0..2.0.1)
## New Features
- **Built-in Web Search tool**: Integrated web search as a built-in Agent tool, reducing decision cost ([4f0ea5d](https://github.com/zhayujie/chatgpt-on-wechat/commit/4f0ea5d7568d61db91ff69c91c429e785fd1b1c2))
- **Claude Opus 4.6 model support**: Added support for Claude Opus 4.6 model ([#2661](https://github.com/zhayujie/chatgpt-on-wechat/pull/2661))
- **WeCom image recognition**: Support image message recognition in WeCom channel ([#2667](https://github.com/zhayujie/chatgpt-on-wechat/pull/2667))
## Improvements
- **Smart context management**: Resolved chat context overflow with an intelligent context trimming strategy that keeps conversations within token limits ([cea7fb7](https://github.com/zhayujie/chatgpt-on-wechat/commit/cea7fb7490c53454602bf05955a0e9f059bcf0fd), [8acf2db](https://github.com/zhayujie/chatgpt-on-wechat/commit/8acf2dbdfe713b84ad74b761b7f86674b1c1904d)) [#2663](https://github.com/zhayujie/chatgpt-on-wechat/issues/2663)
- **Runtime info dynamic update**: Automatic update of timestamps and other runtime info in system prompts via dynamic functions ([#2655](https://github.com/zhayujie/chatgpt-on-wechat/pull/2655), [#2657](https://github.com/zhayujie/chatgpt-on-wechat/pull/2657))
- **Skill prompt optimization**: Improved Skill system prompt generation, simplified tool descriptions for better Agent performance ([6c21833](https://github.com/zhayujie/chatgpt-on-wechat/commit/6c218331b1f1208ea8be6bf226936d3b556ade3e))
- **GLM custom API Base URL**: Support custom API Base URL for GLM models ([#2660](https://github.com/zhayujie/chatgpt-on-wechat/pull/2660))
- **Startup script optimization**: Improved `run.sh` script interaction and configuration flow ([#2656](https://github.com/zhayujie/chatgpt-on-wechat/pull/2656))
- **Decision step logging**: Added Agent decision step logging for debugging ([cb303e6](https://github.com/zhayujie/chatgpt-on-wechat/commit/cb303e6109c50c8dfef1f5e6c1ec47223bf3cd11))
## Bug Fixes
- **Scheduler memory loss**: Fixed memory loss caused by Scheduler dispatcher ([a77a874](https://github.com/zhayujie/chatgpt-on-wechat/commit/a77a8741b500a408c6f5c8868856fb4b018fe9db))
- **Empty tool calls & long results**: Fixed handling of empty tool calls and excessively long tool results ([0542700](https://github.com/zhayujie/chatgpt-on-wechat/commit/0542700f9091ebb08c1a56103b0f0f45f24aa621))
- **OpenAI Function Call**: Fixed function call compatibility with OpenAI models ([158c87a](https://github.com/zhayujie/chatgpt-on-wechat/commit/158c87ab8b05bae054cc1b4eacdbb64fc1062ba9))
- **Claude tool name field**: Removed extraneous tool name field from Claude model responses ([eec10cb](https://github.com/zhayujie/chatgpt-on-wechat/commit/eec10cb5db6a3d5bc12ef606606532237d2c5f6e))
- **MiniMax reasoning**: Optimized MiniMax model reasoning content handling and hid the thinking process output ([c72cda3](https://github.com/zhayujie/chatgpt-on-wechat/commit/c72cda33864bd1542012ee6e0a8bd8c6c88cb5ed), [72b1cac](https://github.com/zhayujie/chatgpt-on-wechat/commit/72b1cacea1ba0d1f3dedacbab2e088e98fd7e172))
- **GLM thinking process**: Hid the GLM model's thinking process from display ([72b1cac](https://github.com/zhayujie/chatgpt-on-wechat/commit/72b1cacea1ba0d1f3dedacbab2e088e98fd7e172))
- **Feishu connection & SSL**: Fixed Feishu channel SSL certificate errors and connection issues ([229b14b](https://github.com/zhayujie/chatgpt-on-wechat/commit/229b14b6fcabe7123d53cab1dea39f38dab26d6d), [8674421](https://github.com/zhayujie/chatgpt-on-wechat/commit/867442155e7f095b4f38b0856f8c1d8312b5fcf7))
- **model_type validation**: Fixed `AttributeError` caused by non-string `model_type` ([#2666](https://github.com/zhayujie/chatgpt-on-wechat/pull/2666))
## Platform Compatibility
- **Windows compatibility**: Fixed path handling, file encoding, and `os.getuid()` unavailability on Windows across multiple tool modules ([051ffd7](https://github.com/zhayujie/chatgpt-on-wechat/commit/051ffd78a372f71a967fd3259e37fe19131f83cf), [5264f7c](https://github.com/zhayujie/chatgpt-on-wechat/commit/5264f7ce18360ee4db5dcb4ebe67307977d40014))


@@ -0,0 +1,33 @@
---
title: Image Vision
description: Recognize images using OpenAI vision models
---
# openai-image-vision
Analyze image content using OpenAI's GPT-4 Vision API, understanding objects, text, colors, and other elements in images.
## Dependencies
| Dependency | Description |
| --- | --- |
| `OPENAI_API_KEY` | OpenAI API key |
| `curl`, `base64` | System commands (usually pre-installed) |
Configuration:
- Configure `OPENAI_API_KEY` via the `env_config` tool
- Or set `open_ai_api_key` in `config.json`
## Supported Models
- `gpt-4.1-mini` (recommended, cost-effective)
- `gpt-4.1`
## Usage
Once configured, send an image to the Agent to automatically trigger image recognition.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>

67
docs/en/skills/index.mdx Normal file

@@ -0,0 +1,67 @@
---
title: Skills Overview
description: CowAgent skills system introduction
---
Skills make the Agent extensible. Each Skill consists of a description file (`SKILL.md`), execution scripts (optional), and resources (optional), and describes how to accomplish a specific type of task.
The difference between Skills and Tools: Tools are atomic operations implemented in code (e.g., file read/write, command execution), while Skills are high-level workflows based on description files that can combine multiple Tools to complete complex tasks.
## Built-in Skills
Built-in skills live in the project's `skills/` directory and are enabled automatically when their dependencies are met:
| Skill | Description | Dependencies |
| --- | --- | --- |
| [`skill-creator`](/en/skills/skill-creator) | Create custom skills through conversation | None |
| [`openai-image-vision`](/en/skills/image-vision) | Recognize images using OpenAI vision models | `OPENAI_API_KEY` |
| [`linkai-agent`](/en/skills/linkai-agent) | Integrate LinkAI platform agents | `LINKAI_API_KEY` |
| [`web-fetch`](/en/skills/web-fetch) | Fetch web page text content | `curl` (enabled by default) |
## Custom Skills
Custom skills are created by users through conversation and stored in the workspace (`~/cow/skills/`). They can implement complex business processes and third-party system integrations.
## Skill Loading Priority
1. **Workspace skills** (highest): `~/cow/skills/`
2. **Project built-in skills** (lowest): `skills/`
When two skills share a name, the higher-priority one overrides the other.
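The override rule can be sketched as a dictionary that is filled from the lowest-priority source first. This is a minimal illustration, assuming a skill is any directory containing a `SKILL.md`; the function name `resolve_skills` is hypothetical and this is not CowAgent's actual loader.

```python
from pathlib import Path
from typing import Dict


def resolve_skills(builtin_dir: Path, workspace_dir: Path) -> Dict[str, Path]:
    """Map skill name -> directory; workspace skills win on name clashes."""
    resolved: Dict[str, Path] = {}
    # Load the lowest priority first so later sources overwrite earlier ones.
    for root in (builtin_dir, workspace_dir):
        if not root.is_dir():
            continue
        for entry in sorted(root.iterdir()):
            if (entry / "SKILL.md").is_file():
                resolved[entry.name] = entry
    return resolved
```

Calling `resolve_skills(Path("skills"), Path("~/cow/skills").expanduser())` would return the workspace copy of any skill that exists in both places.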
## Skill File Structure
```
skills/
├── my-skill/
│   ├── SKILL.md          # Skill description (frontmatter + instructions)
│   ├── scripts/          # Execution scripts (optional)
│   └── resources/        # Additional resources (optional)
```
### SKILL.md Format
```markdown
---
name: my-skill
description: Brief description of the skill
metadata:
  emoji: 🔧
  requires:
    bins: ["curl"]
    env: ["MY_API_KEY"]
  primaryEnv: "MY_API_KEY"
---
# My Skill
Detailed instructions...
```
| Field | Description |
| --- | --- |
| `name` | Skill name, must match directory name |
| `description` | Skill description, Agent decides whether to invoke based on this |
| `metadata.requires.bins` | Required system commands |
| `metadata.requires.env` | Required environment variables |
| `metadata.always` | Always load (default false) |
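To make the `metadata.requires` fields concrete, the following sketch checks whether a skill's declared commands and environment variables are available. It assumes the frontmatter has already been parsed into a dictionary; `missing_requirements` is a hypothetical name, and the real enablement logic in CowAgent may differ.

```python
import os
import shutil
from typing import Dict, List


def missing_requirements(requires: Dict[str, List[str]]) -> List[str]:
    """Return the unmet entries of a skill's `metadata.requires` mapping."""
    missing = []
    for binary in requires.get("bins", []):
        if shutil.which(binary) is None:  # command not found on PATH
            missing.append(f"bin:{binary}")
    for var in requires.get("env", []):
        if not os.environ.get(var):  # unset or empty
            missing.append(f"env:{var}")
    return missing


print(missing_requirements({"bins": ["curl"], "env": ["MY_API_KEY"]}))
```

An empty list would mean the skill can be enabled; any entry names the dependency that still needs to be installed or configured.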


@@ -0,0 +1,49 @@
---
title: LinkAI Agent
description: Integrate LinkAI platform multi-agent skill
---
# linkai-agent
Use agents from the [LinkAI](https://link-ai.tech/) platform as Skills for multi-agent decision-making. The Agent intelligently selects based on agent names and descriptions, calling the corresponding application or workflow via `app_code`.
## Dependencies
| Dependency | Description |
| --- | --- |
| `LINKAI_API_KEY` | LinkAI platform API key, created in [Console](https://link-ai.tech/console/interface) |
| `curl` | System command (usually pre-installed) |
Configuration:
- Configure `LINKAI_API_KEY` via the `env_config` tool
- Or set `linkai_api_key` in `config.json`
## Configure Agents
Add available agents in `skills/linkai-agent/config.json`:
```json
{
  "apps": [
    {
      "app_code": "G7z6vKwp",
      "app_name": "LinkAI Customer Support",
      "app_description": "Select this assistant only when the user needs help with LinkAI platform questions"
    },
    {
      "app_code": "SFY5x7JR",
      "app_name": "Content Creator",
      "app_description": "Use this assistant only when the user needs to create images or videos"
    }
  ]
}
```
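Before relying on this file, it can help to check that every entry is complete. The sketch below is one way to do that; treating all three fields as mandatory is an assumption made for illustration, and `validate_apps` is a hypothetical helper rather than part of the skill.

```python
import json
from typing import List

# Assumption: each app entry needs all three fields shown in the example config.
REQUIRED_KEYS = ("app_code", "app_name", "app_description")


def validate_apps(raw: str) -> List[str]:
    """Return a list of problems found in a linkai-agent style config string."""
    problems = []
    apps = json.loads(raw).get("apps", [])
    if not apps:
        problems.append("no apps configured")
    for index, app in enumerate(apps):
        for key in REQUIRED_KEYS:
            if not app.get(key):
                problems.append(f"apps[{index}] is missing {key}")
    return problems
```

Since the Agent chooses between agents by name and description, a missing `app_description` is the most likely cause of a poor selection.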
## Usage
Once configured, the Agent will automatically select the appropriate LinkAI agent based on the user's question.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>


@@ -0,0 +1,33 @@
---
title: Skill Creator
description: Create custom skills through conversation
---
# skill-creator
Quickly create, install, or update skills through natural language conversation.
## Dependencies
No extra dependencies, always available.
## Usage
- Codify workflows as skills: "Create a skill from this deployment process"
- Integrate third-party APIs: "Create a skill based on this API documentation"
- Install remote skills: "Install xxx skill for me"
## Creation Flow
1. Tell the Agent what skill you want to create
2. Agent automatically generates `SKILL.md` description and execution scripts
3. Skill is saved to the workspace `~/cow/skills/` directory
4. Agent will automatically recognize and use the skill in future conversations
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
<Tip>
See the [Skill Creator documentation](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md) for details.
</Tip>


@@ -0,0 +1,33 @@
---
title: Web Fetch
description: Fetch web page text content
---
# web-fetch
Use curl to fetch web pages and extract readable text content. A lightweight web access method without browser automation.
## Dependencies
| Dependency | Description |
| --- | --- |
| `curl` | System command (usually pre-installed) |
This skill sets `always: true`, so it is enabled by default whenever the `curl` command is available on the system.
## Usage
The Agent invokes this skill automatically when it needs to fetch content from a URL. No extra configuration is needed.
## Comparison with browser Tool
| Feature | web-fetch (skill) | browser (tool) |
| --- | --- | --- |
| Dependencies | curl only | browser-use + playwright |
| JS rendering | Not supported | Supported |
| Page interaction | Not supported | Supports click, type, etc. |
| Best for | Static page text | Dynamic web pages |
<Tip>
For most web content retrieval scenarios, web-fetch is sufficient. Only use the browser tool when you need JS rendering or page interaction.
</Tip>
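To show what "extract readable text" means for a static page, here is a small Python approximation of that step using only the standard library. The real skill is built on `curl`; the class and function names below are hypothetical. Because nothing executes scripts, text that a page generates with JavaScript never appears in the output, which is the limitation noted in the comparison table.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document, one chunk per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```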

30
docs/en/tools/bash.mdx Normal file

@@ -0,0 +1,30 @@
---
title: bash - Terminal
description: Execute system commands
---
# bash
Executes Bash commands in the current working directory and returns stdout and stderr. API keys configured via `env_config` are automatically injected into the environment.
## Dependencies
No extra dependencies, available by default.
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `command` | string | Yes | Command to execute |
| `timeout` | integer | No | Timeout in seconds |
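The two parameters map naturally onto Python's `subprocess` module. The following is a minimal sketch of that idea, not the tool's actual implementation; `run_bash` and the shape of the returned dictionary are assumptions, and the injection of keys from `env_config` is omitted.

```python
import subprocess
from typing import Optional


def run_bash(command: str, timeout: Optional[int] = None) -> dict:
    """Run a command with bash and return its stdout, stderr, and exit code."""
    try:
        proc = subprocess.run(
            ["bash", "-c", command],
            capture_output=True,
            text=True,
            timeout=timeout,  # seconds; None means wait indefinitely
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run before this exception surfaces.
        return {"stdout": "", "stderr": f"timed out after {timeout}s", "exit_code": None}
    return {"stdout": proc.stdout, "stderr": proc.stderr, "exit_code": proc.returncode}
```

Returning stderr alongside stdout matters for an agent, because error text is often what it needs in order to decide on the next step.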
## Use Cases
- Install packages and dependencies
- Run code and tests
- Deploy applications and services (Nginx config, process management, etc.)
- System administration and troubleshooting
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>

27
docs/en/tools/browser.mdx Normal file

@@ -0,0 +1,27 @@
---
title: browser - Browser
description: Access and interact with web pages
---
# browser
Uses a browser to access and interact with web pages, including JavaScript-rendered dynamic pages.
## Dependencies
| Dependency | Install Command |
| --- | --- |
| `browser-use` ≥ 0.1.40 | `pip install browser-use` |
| `markdownify` | `pip install markdownify` |
| `playwright` + chromium | `pip install playwright && playwright install chromium` |
## Use Cases
- Access specific URLs to get page content
- Interact with web page elements (click, type, etc.)
- Verify deployed web pages
- Scrape dynamic content requiring JS rendering
<Note>
The browser tool has heavy dependencies. If not needed, skip installation. For lightweight web content retrieval, use the `web-fetch` skill instead.
</Note>

26
docs/en/tools/edit.mdx Normal file

@@ -0,0 +1,26 @@
---
title: edit - File Edit
description: Edit files via precise text replacement
---
# edit
Edit files via precise text replacement. If `oldText` is empty, appends to the end of the file.
## Dependencies
No extra dependencies, available by default.
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `path` | string | Yes | File path |
| `oldText` | string | Yes | Original text to replace (empty to append) |
| `newText` | string | Yes | Replacement text |
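The replace-or-append behavior can be sketched in a few lines. This is an illustration only: `edit_file` is a hypothetical function, and rejecting text that matches zero or several times is an assumption about what "precise" replacement implies, not documented behavior.

```python
from pathlib import Path


def edit_file(path: str, old_text: str, new_text: str) -> None:
    """Replace one exact occurrence of old_text, or append if old_text is empty."""
    target = Path(path)
    content = target.read_text(encoding="utf-8")
    if old_text == "":
        content += new_text  # empty oldText means append to the end of the file
    else:
        count = content.count(old_text)
        if count != 1:
            # Assumption: refuse ambiguous or missing matches rather than guess.
            raise ValueError(f"expected exactly one match, found {count}")
        content = content.replace(old_text, new_text, 1)
    target.write_text(content, encoding="utf-8")
```

Under this sketch, `oldText` should include enough surrounding context to be unique in the file.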
## Use Cases
- Modify specific parameters in configuration files
- Fix bugs in code
- Insert content at specific positions in files


@@ -0,0 +1,38 @@
---
title: env_config - Environment
description: Manage API keys and secrets
---
# env_config
Manages environment variables (API keys and secrets) in the workspace `.env` file and lets you update them through conversation. The tool includes built-in security protection and masks secret values in its output.
## Dependencies
| Dependency | Install Command |
| --- | --- |
| `python-dotenv` ≥ 1.0.0 | `pip install "python-dotenv>=1.0.0"` |
Included when installing optional dependencies: `pip3 install -r requirements-optional.txt`
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `action` | string | Yes | Operation type: `get`, `set`, `list`, `delete` |
| `key` | string | No | Environment variable name |
| `value` | string | No | Environment variable value (only for `set`) |
## Usage
Tell the Agent what key you need to configure, and it will automatically invoke this tool:
- "Configure my BOCHA_API_KEY"
- "Set OPENAI_API_KEY to sk-xxx"
- "Show configured environment variables"
Configured keys are automatically injected into the `bash` tool's execution environment.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234939.png" width="800" />
</Frame>
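The idea behind `set`, `list`, and value masking can be approximated with the standard library alone. The sketch below is an assumption-laden illustration: the real tool depends on `python-dotenv`, the helper names are hypothetical, the masking format is invented for the example, and this simplified writer drops comments when it rewrites the file.

```python
from pathlib import Path
from typing import Dict


def mask(value: str) -> str:
    """Hide most of a secret, keeping a short prefix and suffix for recognition."""
    if len(value) <= 8:
        return "*" * len(value)
    return f"{value[:4]}{'*' * (len(value) - 8)}{value[-4:]}"


def load_env(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blank lines and comments."""
    env: Dict[str, str] = {}
    if path.is_file():
        for line in path.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                env[key.strip()] = value.strip()
    return env


def set_key(path: Path, key: str, value: str) -> None:
    """Insert or update one key, then rewrite the file (comments are not preserved)."""
    env = load_env(path)
    env[key] = value
    path.write_text("".join(f"{k}={v}\n" for k, v in env.items()), encoding="utf-8")
```

A `list` action would then print `mask(value)` for each entry instead of the raw secret.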

50
docs/en/tools/index.mdx Normal file
View File

@@ -0,0 +1,50 @@
---
title: Tools Overview
description: CowAgent built-in tools system
---
Tools are the core capability that lets the Agent access operating system resources. The Agent intelligently selects and invokes tools based on task requirements, performing file operations, command execution, web search, scheduled tasks, and more. Tools are implemented in the `agent/tools/` directory.
## Built-in Tools
The following tools are available by default with no extra configuration:
<CardGroup cols={2}>
<Card title="read - File Read" icon="file" href="/en/tools/read">
Read file content, supports text, images, PDF
</Card>
<Card title="write - File Write" icon="pen" href="/en/tools/write">
Create or overwrite files
</Card>
<Card title="edit - File Edit" icon="pen-to-square" href="/en/tools/edit">
Edit files via precise text replacement
</Card>
<Card title="ls - Directory List" icon="folder-open" href="/en/tools/ls">
List directory contents
</Card>
<Card title="bash - Terminal" icon="terminal" href="/en/tools/bash">
Execute system commands
</Card>
<Card title="send - File Send" icon="paper-plane" href="/en/tools/send">
Send files or images to user
</Card>
<Card title="memory - Memory" icon="brain" href="/en/tools/memory">
Search and read long-term memory
</Card>
</CardGroup>
## Optional Tools
The following tools require additional dependencies or API key configuration:
<CardGroup cols={2}>
<Card title="env_config - Environment" icon="key" href="/en/tools/env-config">
Manage API keys and secrets
</Card>
<Card title="scheduler - Scheduler" icon="clock" href="/en/tools/scheduler">
Create and manage scheduled tasks
</Card>
<Card title="web_search - Web Search" icon="magnifying-glass" href="/en/tools/web-search">
Search the internet for real-time information
</Card>
</CardGroup>

25
docs/en/tools/ls.mdx Normal file

@@ -0,0 +1,25 @@
---
title: ls - Directory List
description: List directory contents
---
# ls
Lists directory contents in alphabetical order. Directories are suffixed with `/`, and hidden files are included.
## Dependencies
No extra dependencies, available by default.
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `path` | string | Yes | Directory path, relative paths are based on workspace directory |
| `limit` | integer | No | Maximum entries to return, default 500 |
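The listing behavior described above can be sketched as follows. `list_dir` is a hypothetical function, and the case-insensitive sort is an assumption; the documentation only says the output is alphabetical.

```python
import os
from typing import List


def list_dir(path: str, limit: int = 500) -> List[str]:
    """List entries alphabetically, marking directories with a trailing slash."""
    # os.listdir includes dotfiles; sorting case-insensitively is an assumption.
    names = sorted(os.listdir(path), key=str.lower)
    entries = [
        name + "/" if os.path.isdir(os.path.join(path, name)) else name
        for name in names
    ]
    return entries[:limit]
```

The `limit` cap keeps a very large directory from flooding the Agent's context.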
## Use Cases
- Browse project structure
- Find specific files
- Check if a directory exists

38
docs/en/tools/memory.mdx Normal file

@@ -0,0 +1,38 @@
---
title: memory - Memory
description: Search and read long-term memory
---
# memory
The memory tool contains two sub-tools: `memory_search` (search memory) and `memory_get` (read memory files).
## Dependencies
No extra dependencies, available by default. Managed by the Agent Core memory system.
## memory_search
Search historical memory with hybrid keyword and vector retrieval.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `query` | string | Yes | Search query |
## memory_get
Read the content of a specific memory file.
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `path` | string | Yes | Relative path to memory file (e.g. `MEMORY.md`, `memory/2026-01-01.md`) |
| `start_line` | integer | No | Start line number |
| `end_line` | integer | No | End line number |
## How It Works
The Agent automatically invokes memory tools in these scenarios:
- When the user shares important information → stores to memory
- When historical context is needed → searches relevant memory
- When conversation reaches a certain length → extracts summary for storage

26
docs/en/tools/read.mdx Normal file

@@ -0,0 +1,26 @@
---
title: read - File Read
description: Read file content
---
# read
Read file content. Supports text files, PDF files, images (returns metadata), and more.
## Dependencies
No extra dependencies, available by default.
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `path` | string | Yes | File path, relative paths are based on workspace directory |
| `offset` | integer | No | Start line number (1-indexed), negative values read from the end |
| `limit` | integer | No | Number of lines to read |
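One plausible reading of `offset` and `limit` is shown below. This is a sketch under assumptions: `select_lines` is a hypothetical helper, and interpreting a negative offset of `-N` as "start N lines before the end" is a guess at the intended semantics.

```python
from typing import List, Optional


def select_lines(lines: List[str], offset: int = 1, limit: Optional[int] = None) -> List[str]:
    """Pick a window of lines; offset is 1-indexed, negative counts from the end."""
    if offset < 0:
        start = max(len(lines) + offset, 0)  # -3 starts at the third-to-last line
    else:
        start = max(offset - 1, 0)  # convert the 1-indexed offset to a list index
    end = len(lines) if limit is None else start + limit
    return lines[start:end]
```

Under this reading, `offset=-50` returns the last 50 lines, which is convenient for tailing a log file.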
## Use Cases
- View configuration files, log files
- Read code files for analysis
- Check image/video file info


@@ -0,0 +1,42 @@
---
title: scheduler - Scheduler
description: Create and manage scheduled tasks
---
# scheduler
Create and manage dynamic scheduled tasks with flexible scheduling and execution modes.
## Dependencies
| Dependency | Install Command |
| --- | --- |
| `croniter` ≥ 2.0.0 | `pip install "croniter>=2.0.0"` |
Included in core dependencies: `pip3 install -r requirements.txt`
## Scheduling Modes
| Mode | Description |
| --- | --- |
| One-time | Execute once at a specified time |
| Fixed interval | Repeat at fixed time intervals |
| Cron expression | Define complex schedules using Cron syntax |
## Execution Modes
- **Fixed message**: Send a preset message when triggered
- **Agent dynamic task**: Agent intelligently executes the task when triggered
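How a scheduler might compute the next trigger time for the one-time and fixed-interval modes can be sketched with the standard library. Everything here is illustrative: `next_run` and the mode names `"once"` and `"interval"` are hypothetical, and cron expressions are left out because they are handled by the `croniter` package.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_run(mode: str, now: datetime, *, run_at: Optional[datetime] = None,
             last_run: Optional[datetime] = None,
             interval: Optional[timedelta] = None) -> Optional[datetime]:
    """Return the next trigger time, or None when nothing is left to run."""
    if mode == "once":
        # A one-time task only fires if its time is still in the future.
        return run_at if run_at is not None and run_at > now else None
    if mode == "interval":
        if interval is None or interval <= timedelta(0):
            raise ValueError("interval must be positive")
        candidate = (last_run or now) + interval
        while candidate <= now:  # skip slots missed while the service was down
            candidate += interval
        return candidate
    raise ValueError(f"unsupported mode: {mode}")
```

For example, "Check server status every 2 hours" would correspond to `mode="interval"` with `interval=timedelta(hours=2)`.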
## Usage
Create and manage scheduled tasks with natural language:
- "Send me a weather report every morning at 9 AM"
- "Check server status every 2 hours"
- "Remind me about the meeting tomorrow at 3 PM"
- "Show all scheduled tasks"
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>

25
docs/en/tools/send.mdx Normal file

@@ -0,0 +1,25 @@
---
title: send - File Send
description: Send files to user
---
# send
Sends files (images, videos, audio, documents, etc.) to the user. Use it when the user explicitly asks to send or share a file.
## Dependencies
No extra dependencies, available by default.
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `path` | string | Yes | File path, can be absolute or relative to workspace |
| `message` | string | No | Accompanying message |
## Use Cases
- Send generated code or documents to the user
- Send screenshots, charts
- Share downloaded files


@@ -0,0 +1,34 @@
---
title: web_search - Web Search
description: Search the internet for real-time information
---
# web_search
Search the internet for real-time information, news, research, and more. Supports two search backends with automatic fallback.
## Dependencies
Requires at least one search API key (configured via `env_config` tool or workspace `.env` file):
| Backend | Environment Variable | Priority | How to Get |
| --- | --- | --- | --- |
| Bocha Search | `BOCHA_API_KEY` | Primary | [Bocha Open Platform](https://open.bochaai.com/) |
| LinkAI Search | `LINKAI_API_KEY` | Fallback | [LinkAI Console](https://link-ai.tech/console/interface) |
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `query` | string | Yes | Search keywords |
| `count` | integer | No | Number of results (1-50, default 10) |
| `freshness` | string | No | Time range: `noLimit`, `oneDay`, `oneWeek`, `oneMonth`, `oneYear`, or date range like `2025-01-01..2025-02-01` |
| `summary` | boolean | No | Return page summaries (default false) |
## Use Cases
When the user asks about latest information, needs fact-checking, or real-time data, the Agent automatically invokes this tool.
<Note>
If no search API key is configured, this tool will not be loaded.
</Note>
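The priority order in the Dependencies table, together with the rule that the tool is not loaded without a key, can be expressed as a short selection function. This is a sketch, not the tool's source: `pick_backend` and the backend labels are hypothetical, and it covers only key-based selection, not any fallback that may happen when a request fails at runtime.

```python
import os
from typing import Mapping, Optional

# Backends in priority order, as listed in the Dependencies table.
BACKENDS = (("bocha", "BOCHA_API_KEY"), ("linkai", "LINKAI_API_KEY"))


def pick_backend(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the first backend whose API key is set, or None if none is."""
    for name, variable in BACKENDS:
        if env.get(variable):
            return name
    return None  # no key configured: the tool would not be loaded
```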

29
docs/en/tools/write.mdx Normal file

@@ -0,0 +1,29 @@
---
title: write - File Write
description: Create or overwrite files
---
# write
Write content to a file. Creates the file if it doesn't exist, overwrites if it does. Automatically creates parent directories.
## Dependencies
No extra dependencies, available by default.
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `path` | string | Yes | File path |
| `content` | string | Yes | Content to write |
## Use Cases
- Create new code files or scripts
- Generate configuration files
- Save processing results
<Note>
Single writes should not exceed 10KB. For large files, create a skeleton first, then use the edit tool to add content in chunks.
</Note>


@@ -0,0 +1,113 @@
---
title: Manual Install
description: Deploy CowAgent manually (source code / Docker)
---
## Source Code Deployment
### 1. Clone the Project
```bash
git clone https://github.com/zhayujie/chatgpt-on-wechat
cd chatgpt-on-wechat/
```
<Tip>
If you run into network issues, use the mirror repository hosted in China: https://gitee.com/zhayujie/chatgpt-on-wechat
</Tip>
### 2. Install Dependencies
Core dependencies (required):
```bash
pip3 install -r requirements.txt
```
Optional dependencies (recommended):
```bash
pip3 install -r requirements-optional.txt
```
### 3. Configure
Copy the configuration template and edit it:
```bash
cp config-template.json config.json
```
Fill in the model API key, channel type, and other settings in `config.json`. See the [model docs](/models/minimax) for details.
### 4. Run
**Run locally:**
```bash
python3 app.py
```
The Web service starts by default. Open `http://localhost:9899/chat` to start chatting.
**Run in the background on a server:**
```bash
nohup python3 app.py & tail -f nohup.out
```
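## Docker Deployment
Docker deployment requires neither downloading the source code nor installing dependencies. For Agent mode, source code deployment is recommended because it gives the Agent broader access to the system.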
<Note>
Requires [Docker](https://docs.docker.com/engine/install/) and docker-compose.
</Note>
**1. Download the configuration file**
```bash
wget https://cdn.link-ai.tech/code/cow/docker-compose.yml
```
Open `docker-compose.yml` and fill in the required settings.
**2. Start the container**
```bash
sudo docker compose up -d
```
**3. View logs**
```bash
sudo docker logs -f chatgpt-on-wechat
```
## Core Configuration Options
```json
{
"channel_type": "web",
"model": "MiniMax-M2.5",
"agent": true,
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30,
"agent_max_steps": 15
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `channel_type` | Channel type | `web` |
| `model` | Model name | `MiniMax-M2.5` |
| `agent` | Whether to enable Agent mode | `true` |
| `agent_workspace` | Agent workspace path | `~/cow` |
| `agent_max_context_tokens` | Maximum context tokens | `40000` |
| `agent_max_context_turns` | Maximum context turns kept in memory | `30` |
| `agent_max_steps` | Maximum decision steps per task | `15` |
<Tip>
All configuration options are listed in the project's [`config.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/config.py) file.
</Tip>
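The relationship between `config.json` and the defaults in the table can be sketched as a simple overlay. This is only an illustration of the idea; `load_config` below is a hypothetical function and does not reproduce the loader in the project's `config.py`.

```python
import json
from pathlib import Path

# Defaults copied from the table above.
DEFAULTS = {
    "channel_type": "web",
    "model": "MiniMax-M2.5",
    "agent": True,
    "agent_workspace": "~/cow",
    "agent_max_context_tokens": 40000,
    "agent_max_context_turns": 30,
    "agent_max_steps": 15,
}


def load_config(path: str = "config.json") -> dict:
    """Overlay user settings from config.json on top of the defaults."""
    config = dict(DEFAULTS)
    file = Path(path)
    if file.is_file():
        config.update(json.loads(file.read_text(encoding="utf-8")))
    # Expand "~" so the workspace path can be used directly.
    config["agent_workspace"] = str(Path(config["agent_workspace"]).expanduser())
    return config
```

With this approach a `config.json` only needs to list the values that differ from the defaults.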


@@ -0,0 +1,39 @@
---
title: One-click Install
description: Install and manage CowAgent with a single script
---
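The project provides a script that installs, configures, starts, and manages the program in one step. Using it is the recommended way to get running quickly.
Linux, macOS, and Windows are supported. Python 3.7 to 3.12 is required (3.9 recommended).
## Install Command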
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
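The script performs the following steps automatically:
1. Check the Python environment (Python 3.7+ required)
2. Install required tools (git, curl, etc.)
3. Clone the project code to `~/chatgpt-on-wechat`
4. Install Python dependencies
5. Guide you through configuring the AI model and communication channel
6. Start the service
The Web service starts by default. Open `http://localhost:9899/chat` to start chatting.
## Management Commands
After installation, use the following commands to manage the service:
| Command | Description |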
| --- | --- |
| `./run.sh start` | 启动服务 |
| `./run.sh stop` | 停止服务 |
| `./run.sh restart` | 重启服务 |
| `./run.sh status` | 查看运行状态 |
| `./run.sh logs` | 查看实时日志 |
| `./run.sh config` | 重新配置 |
| `./run.sh update` | 更新项目代码 |

docs/images/favicon.ico Normal file
Binary file not shown (4.2 KiB).

docs/images/logo.jpg Normal file
Binary file not shown (21 KiB).

@@ -0,0 +1,71 @@
---
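title: Architecture
description: System architecture and core design of CowAgent 2.0
---
CowAgent 2.0 is a full upgrade from a simple chatbot to a super intelligent assistant. It is built on an Agent architecture and is capable of autonomous reasoning, task planning, long-term memory, and skill extension.
## System architecture
The overall architecture of CowAgent consists of the following core modules: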
<img src="https://cdn.link-ai.tech/doc/68ef7b212c6f791e0e74314b912149f9-sz_5847990.png" alt="CowAgent Architecture" />
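### Core modules
| Module | Description |
| --- | --- |
| **Channels** | Message channel layer that receives and sends messages. Supports Web, Feishu, DingTalk, WeCom, WeChat Official Accounts, and more |
| **Agent Core** | The agent's core engine, including task planning, the memory system, and the skill engine |
| **Tools** | Tool layer. The Agent accesses operating system resources through tools; more than 10 tools are built in |
| **Models** | Model layer that provides unified access to mainstream large language models from China and abroad |
## Agent mode
With Agent mode enabled, CowAgent runs as an autonomous agent. The core workflow is:
1. **Receive message** - receive user input through a channel
2. **Understand intent** - analyze the task requirements and context
3. **Plan the task** - break a complex task into multiple steps
4. **Call tools** - choose a suitable tool to carry out each step
5. **Update memory** - store important information in long-term memory
6. **Return the result** - send the result back to the user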
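The workflow above can be sketched as a bounded loop. This is a minimal illustration, not CowAgent's actual engine: `plan_next_step`, the `TOOLS` registry, and the list-based memory are hypothetical stand-ins for the model call, the tool layer, and long-term memory. The `max_steps` cap mirrors the `agent_max_steps` setting.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical tool registry: tool name -> callable taking one string argument.
TOOLS: Dict[str, Callable[[str], str]] = {
    "echo": lambda arg: arg,
    "upper": lambda arg: arg.upper(),
}


def plan_next_step(task: str, history: List[str]) -> Optional[Tuple[str, str]]:
    """Stand-in for the model's decision: return (tool, argument) or None when done."""
    if not history:
        return ("upper", task)
    return None  # one tool call is enough for this toy planner


def run_agent(task: str, memory: List[str], max_steps: int = 15) -> str:
    history: List[str] = []
    for _ in range(max_steps):            # cap the number of decision steps
        step = plan_next_step(task, history)
        if step is None:                  # the planner decided the task is complete
            break
        tool, arg = step
        history.append(TOOLS[tool](arg))  # call the chosen tool
    result = history[-1] if history else ""
    memory.append(f"{task} -> {result}")  # record the outcome in (toy) memory
    return result


memory: List[str] = []
print(run_agent("hello", memory))
```

Capping the loop keeps a confused planner from calling tools indefinitely, which is also how the step limit bounds token cost.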
## Workspace
The Agent's workspace lives in `~/cow` by default and stores the system prompt, memory files, skill files, and so on:
```
~/cow/
├── system.md # Agent system prompt
├── user.md # User profile
├── memory/ # Long-term memory storage
│ ├── core.md # Core memory
│ └── daily/ # Daily memory
├── skills/ # Custom skills
│ ├── skill-1/
│ └── skill-2/
└── .env # Secret keys for skills
```
## Core configuration
Configure the core Agent-mode parameters in `config.json`:
```json
{
"agent": true,
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30,
"agent_max_steps": 15
}
```
| Parameter | Description | Default |
| --- | --- | --- |
| `agent` | Whether to enable Agent mode | `true` |
| `agent_workspace` | Workspace path | `~/cow` |
| `agent_max_context_tokens` | Maximum number of context tokens | `40000` |
| `agent_max_context_turns` | Maximum number of conversation turns kept in context | `30` |
| `agent_max_steps` | Maximum decision steps per task | `15` |

docs/intro/features.mdx Normal file

@@ -0,0 +1,105 @@
---
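title: Features
description: Detailed description of CowAgent's long-term memory, task planning, and skill system
---
## 1. Long-term memory
> The memory system lets the Agent remember important information over the long term. The Agent proactively stores information when the user shares preferences, decisions, facts, and other important details, and it automatically extracts a summary once a conversation reaches a certain length. Memory is divided into core memory and daily memory, and supports a hybrid retrieval mode that combines keyword search and vector search.
When the Agent starts for the first time, it proactively asks for key information and records it in the agent profile, user identity, and memory files in the workspace (`~/cow` by default).
In later long-running conversations, the Agent records or retrieves memories as needed and keeps updating its own profile, the user's preferences, and the memory files. It also summarizes and records lessons learned, which lets it reason autonomously and keep improving.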
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
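## 2. Task planning and tool calls
Tools are how the Agent accesses operating system resources. The Agent chooses and calls tools based on what the task requires, completing operations such as reading and writing files, running commands, and scheduling tasks. The built-in tools are implemented in the project's `agent/tools/` directory.
**Main tools:** file read/write/edit, Bash terminal, file sending, scheduling, memory search, web search, environment configuration, and more.
### 2.1 Terminal and file access
Access to the operating system's terminal and files is the most basic and essential tool capability; many other tools and skills build on it. Users can interact with the Agent from a phone to operate resources on a personal computer or server: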
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202181130.png" width="800" />
</Frame>
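### 2.2 Programming
Building on its programming and system access capabilities, the Agent can carry out the **full Vibecoding workflow**: information search, generating images and other assets, coding, testing, deployment, Nginx configuration changes, and release. A single instruction from a phone is enough to produce a quick demo of an application: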
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>
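### 2.3 Scheduled tasks
Dynamic scheduled tasks are implemented with the `scheduler` tool. Three forms are supported: **one-time tasks, fixed intervals, and Cron expressions**. When a task fires, it can either **send a fixed message** or **run a dynamic Agent task**: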
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>
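To make the schedule forms concrete, here is a small sketch of how the next trigger time could be computed for one-time and fixed-interval tasks. The `next_run` function and its parameters are hypothetical and do not reflect the `scheduler` tool's real interface; Cron expressions are left out because parsing them needs a dedicated library.

```python
from datetime import datetime, timedelta
from typing import Optional


def next_run(kind: str, now: datetime, *, at: Optional[datetime] = None,
             every: Optional[timedelta] = None,
             last_run: Optional[datetime] = None) -> Optional[datetime]:
    """Return the next trigger time, or None if the task will never fire again."""
    if kind == "once":
        # A one-time task fires at `at` unless that moment has already passed.
        return at if at is not None and at > now else None
    if kind == "interval":
        if every is None or every <= timedelta(0):
            raise ValueError("interval tasks need a positive `every`")
        # The first run is one interval from now; later runs follow the last one.
        base = last_run if last_run is not None else now
        candidate = base + every
        # Skip missed slots so the result is always in the future.
        while candidate <= now:
            candidate += every
        return candidate
    raise ValueError(f"unsupported schedule kind: {kind}")


now = datetime(2026, 1, 1, 12, 0, 0)
print(next_run("interval", now, every=timedelta(minutes=30)))
```

Skipping missed slots means a task that was due while the service was stopped fires once at the next slot instead of replaying every missed run.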
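### 2.4 Environment variable management
The secret keys that skills need are stored in an environment variable file managed by the `env_config` tool. You can update keys through conversation, and the tool has built-in safeguards and masking of sensitive values: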
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234939.png" width="800" />
</Frame>
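## 3. Skill system
The skill system makes the Agent open-ended and extensible. Each Skill consists of an instruction file, optional scripts, and optional resources, and describes how to complete a particular type of task. With Skills, the Agent can follow instructions to complete complex workflows, call tools, or integrate with third-party systems.
- **Built-in skills:** located in the project's `skills/` directory, including the skill creator, image recognition, LinkAI agents, and web fetching. Built-in Skills are enabled automatically depending on whether their requirements (API keys, system commands, etc.) are met.
- **Custom skills:** created by the user through conversation and stored in the workspace (`~/cow/skills/`). They can implement complex business workflows and integrations with third-party systems.
### 3.1 Creating skills
The `skill-creator` skill lets you create skills quickly through conversation. You can ask the Agent to turn a workflow into a skill, or send it any API documentation and examples and let it complete the integration directly: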
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
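### 3.2 Search and image recognition
- **Web search:** the built-in `web_search` tool supports multiple search engines and is enabled once `BOCHA_API_KEY` or `LINKAI_API_KEY` is configured.
- **Image recognition:** the built-in `openai-image-vision` skill can use models such as `gpt-4.1-mini` and `gpt-4.1`, and depends on `OPENAI_API_KEY`.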
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>
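### 3.3 Third-party knowledge bases and plugins
The `linkai-agent` skill makes all agents on [LinkAI](https://link-ai.tech/) available to the Agent as Skills, enabling multi-agent decision-making.
To configure it, set `LINKAI_API_KEY` through `env_config` and add agent descriptions to `skills/linkai-agent/config.json`: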
```json
{
"apps": [
{
"app_code": "G7z6vKwp",
"app_name": "LinkAI客服助手",
"app_description": "当用户需要了解LinkAI平台相关问题时才选择该助手"
},
{
"app_code": "SFY5x7JR",
"app_name": "内容创作助手",
"app_description": "当用户需要创作图片或视频时才使用该助手"
}
]
}
```
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>

docs/intro/index.mdx Normal file

@@ -0,0 +1,62 @@
---
title: Introduction
description: CowAgent - a super AI assistant built on large language models
---
<img src="https://cdn.link-ai.tech/doc/78c5dd674e2c828642ecc0406669fed7.png" alt="CowAgent" width="600px"/>
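**CowAgent** is a super AI assistant built on large language models. It can think and plan tasks proactively, operate computers and external resources, create and run Skills, and keep long-term memory that lets it improve over time.
CowAgent supports flexible switching between multiple models and handles multimodal messages such as text, voice, images, and files. It can be used through the web, Feishu, DingTalk, WeCom apps, and WeChat Official Accounts, and runs 24/7 on your personal computer or server.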
<Card title="GitHub" icon="github" href="https://github.com/zhayujie/chatgpt-on-wechat">
github.com/zhayujie/chatgpt-on-wechat
</Card>
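## Core capabilities
<CardGroup cols={2}>
<Card title="Complex task planning" icon="brain" href="/intro/architecture">
Understands complex tasks and plans their execution autonomously, reasoning and calling tools until the goal is reached. Through tools it can access system resources such as files, the terminal, the browser, and scheduled tasks.
</Card>
<Card title="Long-term memory" icon="database" href="/memory">
Automatically persists conversation memory to local files and a database, including global memory and daily memory, with keyword and vector retrieval.
</Card>
<Card title="Skill system" icon="puzzle-piece" href="/skills/index">
Provides an engine for creating and running Skills, ships with several built-in skills, and supports developing custom Skills through natural-language conversation.
</Card>
<Card title="Multimodal messages" icon="image" href="/channels/web">
Parses, processes, generates, and sends multiple message types, including text, images, voice, and files.
</Card>
<Card title="Multiple models" icon="microchip" href="/models/index">
Supports mainstream model providers from China and abroad, including OpenAI, Claude, Gemini, DeepSeek, MiniMax, GLM, Qwen, Kimi, and Doubao.
</Card>
<Card title="Multi-platform deployment" icon="server" href="/channels/web">
Runs on a local computer or a server, and integrates with the web, Feishu, DingTalk, WeChat Official Accounts, and WeCom apps.
</Card>
</CardGroup>
## Quick try
Run the following command in a terminal to install, configure, and start CowAgent in one step: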
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
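Once running, the Web service starts by default. Visit `http://localhost:9899/chat` to chat in the browser.
<CardGroup cols={2}>
<Card title="Quick start" icon="rocket" href="/guide/quick-start">
See the complete installation and run guide
</Card>
<Card title="Architecture" icon="sitemap" href="/intro/architecture">
Learn about CowAgent's system architecture
</Card>
</CardGroup>
## Community
Add the assistant on WeChat to join the open-source project's discussion group: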
<img width="140" src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/open-community.png" />

docs/memory.mdx Normal file

@@ -0,0 +1,64 @@
---
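title: Long-term memory
description: CowAgent's long-term memory system
---
The memory system lets the Agent remember important information over the long term, accumulating experience and learning user preferences across conversations so that it can reason autonomously and keep improving.
## How it works
The Agent proactively stores or retrieves memories in the following situations:
- **When the user shares important information** - it automatically identifies key information such as preferences, decisions, and facts, and stores it
- **When a conversation reaches a certain length** - it automatically extracts a summary to avoid losing information
- **When retrieval is needed** - it searches past memories and combines them with the current context to answer
## Memory types
### Core memory
Stored in `~/cow/memory/core.md`. Contains information that does not fade over time, such as the user's long-term preferences, important decisions, and key facts.
### Daily memory
Stored under `~/cow/memory/daily/`, organized by date. Records each day's conversation summaries and key events.
## First launch
When the Agent starts for the first time, it proactively asks the user for key information and records it in the workspace (`~/cow` by default):
| File | Description |
| --- | --- |
| `system.md` | The Agent's system prompt and behavior settings |
| `user.md` | User identity information and preferences |
| `memory/core.md` | Core memory |
| `memory/daily/` | Daily memory directory |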
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
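## Memory retrieval
The memory system supports a hybrid retrieval mode:
- **Keyword retrieval** - matches past memories by keyword
- **Vector retrieval** - searches by semantic similarity, so related memories can be found even when the wording differs
During a conversation, the Agent triggers memory retrieval automatically as needed and adds the relevant history to the context.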
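Hybrid retrieval can be illustrated by blending a keyword score with a vector-similarity score. The sketch below is only a toy: the character-trigram `embed` function stands in for a real embedding model, and the `alpha` weighting is an assumption rather than CowAgent's actual ranking formula.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    return re.findall(r"\w+", text.lower())


def keyword_score(query: str, doc: str) -> float:
    """Fraction of query terms that appear verbatim in the document."""
    terms = set(tokenize(query))
    if not terms:
        return 0.0
    return len(terms & set(tokenize(doc))) / len(terms)


def embed(text: str) -> Counter:
    """Toy embedding: character-trigram counts (a real system would use a model)."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[k] * b[k] for k in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def hybrid_search(query: str, docs: List[str], alpha: float = 0.5) -> List[Tuple[float, str]]:
    """Rank documents by a weighted mix of keyword and vector similarity."""
    q_vec = embed(query)
    scored = [
        (alpha * keyword_score(query, d) + (1 - alpha) * cosine(q_vec, embed(d)), d)
        for d in docs
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = ["user prefers dark mode", "deploy script lives in ops repo", "likes green tea"]
results = hybrid_search("dark mode preference", docs)
print(results[0][1])
```

The keyword part rewards exact term matches, while the vector part still gives credit to near matches such as "preference" and "prefers".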
## Related configuration
```json
{
"agent_workspace": "~/cow",
"agent_max_context_tokens": 40000,
"agent_max_context_turns": 30
}
```
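| Parameter | Description | Default |
| --- | --- | --- |
| `agent_workspace` | Workspace path; memory files are stored under this directory | `~/cow` |
| `agent_max_context_tokens` | Maximum number of context tokens; affects short-term memory capacity | `40000` |
| `agent_max_context_turns` | Maximum number of context turns; the oldest turns are dropped automatically once the limit is exceeded | `30` |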
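The two limits can be pictured as a filter over the conversation history. The following sketch is an assumption about how such trimming might work, not the project's implementation: `estimate_tokens` uses a crude four-characters-per-token guess, and `trim_context` is a hypothetical helper.

```python
from typing import List


def estimate_tokens(text: str) -> int:
    """Very rough token estimate (about 4 characters per token); an assumption."""
    return max(1, len(text) // 4)


def trim_context(turns: List[str], max_turns: int = 30, max_tokens: int = 40000) -> List[str]:
    """Keep the most recent turns that fit both the turn limit and the token budget."""
    recent = turns[-max_turns:] if max_turns > 0 else []
    kept: List[str] = []
    used = 0
    for turn in reversed(recent):      # walk from newest to oldest
        cost = estimate_tokens(turn)
        if used + cost > max_tokens:
            break                      # older turns no longer fit in the budget
        kept.append(turn)
        used += cost
    kept.reverse()                     # restore chronological order
    return kept


turns = [f"turn {i}" for i in range(50)]
print(len(trim_context(turns)))
```

Walking from the newest turn backwards guarantees that the most recent exchanges survive when either limit is hit.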

docs/models/claude.mdx Normal file

@@ -0,0 +1,17 @@
---
title: Claude
description: Claude model configuration
---
```json
{
"model": "claude-sonnet-4-6",
"claude_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Supports `claude-sonnet-4-6`, `claude-opus-4-6`, `claude-sonnet-4-5`, `claude-sonnet-4-0`, `claude-3-5-sonnet-latest`, and others; see the [official models](https://docs.anthropic.com/en/docs/about-claude/models/overview) |
| `claude_api_key` | Create one in the [Claude Console](https://console.anthropic.com/settings/keys) |
| `claude_api_base` | Optional. Defaults to `https://api.anthropic.com/v1`; change it to use a third-party proxy |

docs/models/deepseek.mdx Normal file

@@ -0,0 +1,22 @@
---
title: DeepSeek
description: DeepSeek model configuration
---
Connect through the OpenAI-compatible interface:
```json
{
"model": "deepseek-chat",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.deepseek.com/v1",
"bot_type": "chatGPT"
}
```
| Parameter | Description |
| --- | --- |
| `model` | `deepseek-chat` (DeepSeek-V3) or `deepseek-reasoner` (DeepSeek-R1) |
| `bot_type` | Always `chatGPT` (OpenAI-compatible mode) |
| `open_ai_api_key` | Create one on the [DeepSeek Platform](https://platform.deepseek.com/api_keys) |
| `open_ai_api_base` | Base URL of the DeepSeek platform |
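To show what OpenAI-compatible access means in practice, the sketch below builds the HTTP request that such a configuration implies: a POST to `/chat/completions` under the base URL with a Bearer token. The `build_chat_request` helper is hypothetical and only constructs the request; it is an assumption that the project sends an equivalent call internally.

```python
import json
import urllib.request


def build_chat_request(config: dict, user_message: str) -> urllib.request.Request:
    """Build (but do not send) a chat completion request from config.json values."""
    base = config["open_ai_api_base"].rstrip("/")
    payload = {
        "model": config["model"],
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {config['open_ai_api_key']}",
        },
        method="POST",
    )


config = {
    "model": "deepseek-chat",
    "open_ai_api_key": "YOUR_API_KEY",
    "open_ai_api_base": "https://api.deepseek.com/v1",
}
request = build_chat_request(config, "Hello")
print(request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Because only the base URL, key, and model name change between providers, the same request shape works for any provider that exposes this interface.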

docs/models/doubao.mdx Normal file

@@ -0,0 +1,19 @@
---
title: Doubao
description: Doubao (Volcano Ark) model configuration
---
# Doubao
```json
{
"model": "doubao-seed-2-0-code-preview-260215",
"ark_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Accepts `doubao-seed-2-0-code-preview-260215`, `doubao-seed-2-0-pro-260215`, `doubao-seed-2-0-lite-260215`, and others |
| `ark_api_key` | Create one in the [Volcano Ark console](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) |
| `ark_base_url` | Optional. Defaults to `https://ark.cn-beijing.volces.com/api/v3` |

docs/models/gemini.mdx Normal file

@@ -0,0 +1,16 @@
---
title: Gemini
description: Google Gemini model configuration
---
```json
{
"model": "gemini-3.1-pro-preview",
"gemini_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Supports `gemini-3.1-pro-preview`, `gemini-3-flash-preview`, `gemini-3-pro-preview`, `gemini-2.5-pro`, `gemini-2.0-flash`, and others; see the [official docs](https://ai.google.dev/gemini-api/docs/models) |
| `gemini_api_key` | Create one in [Google AI Studio](https://aistudio.google.com/app/apikey) |

docs/models/glm.mdx Normal file

@@ -0,0 +1,29 @@
---
title: Zhipu GLM
description: Zhipu AI GLM model configuration
---
# Zhipu AI (GLM)
```json
{
"model": "glm-5",
"zhipu_ai_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Accepts `glm-5`, `glm-4.7`, `glm-4-plus`, `glm-4-flash`, `glm-4-air`, and others; see the [model codes](https://bigmodel.cn/dev/api/normal-model/glm-4) |
| `zhipu_ai_api_key` | Create one in the [Zhipu AI console](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) |
OpenAI-compatible access is also supported:
```json
{
"bot_type": "chatGPT",
"model": "glm-5",
"open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
"open_ai_api_key": "YOUR_API_KEY"
}
```

docs/models/index.mdx Normal file

@@ -0,0 +1,55 @@
---
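title: Model overview
description: Models supported by CowAgent and recommended choices
---
CowAgent supports large language models from mainstream providers in China and abroad. The model interfaces are implemented in the project's `models/` directory.
<Note>
The following models are recommended for Agent mode; choose based on quality and cost: MiniMax-M2.5, glm-5, kimi-k2.5, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
</Note>
## Configuration
Depending on the model you choose, fill in the corresponding model name and API key in `config.json`. Each model can also be accessed in OpenAI-compatible mode: set `bot_type` to `chatGPT` and configure `open_ai_api_base` and `open_ai_api_key`.
You can also use the [LinkAI](https://link-ai.tech) platform API, which lets you switch between models flexibly and adds Agent capabilities such as knowledge bases and workflows.
## Supported models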
<CardGroup cols={2}>
<Card title="MiniMax" href="/models/minimax">
MiniMax-M2.5 and other models in the series
</Card>
<Card title="Zhipu GLM" href="/models/glm">
glm-5, glm-4.7, and other models in the series
</Card>
<Card title="Qwen (Tongyi Qianwen)" href="/models/qwen">
qwen3.5-plus, qwen3-max, and others
</Card>
<Card title="Kimi" href="/models/kimi">
kimi-k2.5, kimi-k2, and others
</Card>
<Card title="Doubao" href="/models/doubao">
doubao-seed series models
</Card>
<Card title="Claude" href="/models/claude">
claude-sonnet-4-6 and others
</Card>
<Card title="Gemini" href="/models/gemini">
gemini-3.1-pro-preview and others
</Card>
<Card title="OpenAI" href="/models/openai">
gpt-4.1, o-series, and others
</Card>
<Card title="DeepSeek" href="/models/deepseek">
deepseek-chat, deepseek-reasoner
</Card>
<Card title="LinkAI" href="/models/linkai">
Unified multi-model API plus knowledge bases
</Card>
</CardGroup>
<Tip>
All model names are listed in the project's [`common/const.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py) file.
</Tip>

docs/models/kimi.mdx Normal file

@@ -0,0 +1,29 @@
---
title: Kimi
description: Kimi (Moonshot) model configuration
---
# Kimi (Moonshot)
```json
{
"model": "kimi-k2.5",
"moonshot_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Accepts `kimi-k2.5`, `kimi-k2`, `moonshot-v1-8k`, `moonshot-v1-32k`, `moonshot-v1-128k` |
| `moonshot_api_key` | Create one in the [Moonshot console](https://platform.moonshot.cn/console/api-keys) |
OpenAI-compatible access is also supported:
```json
{
"bot_type": "chatGPT",
"model": "kimi-k2.5",
"open_ai_api_base": "https://api.moonshot.cn/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```

docs/models/linkai.mdx Normal file

@@ -0,0 +1,23 @@
---
title: LinkAI
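description: Unified access to multiple models through the LinkAI platform
---
The [LinkAI](https://link-ai.tech) platform lets you switch flexibly between OpenAI, Claude, Gemini, DeepSeek, Qwen, Kimi, and other models, and adds Agent capabilities such as knowledge bases, workflows, and plugins.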
```json
{
"use_linkai": true,
"linkai_api_key": "YOUR_API_KEY",
"linkai_app_code": "YOUR_APP_CODE"
}
```
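| Parameter | Description |
| --- | --- |
| `use_linkai` | Set to `true` to enable the LinkAI API |
| `linkai_api_key` | Create one in the [console](https://link-ai.tech/console/interface) |
| `linkai_app_code` | Code of the LinkAI agent (app or workflow); optional |
| `model` | Leave empty to use the agent's default model, which can be switched on the platform; every model in the [model list](https://link-ai.tech/console/models) is available |
See the [API documentation](https://docs.link-ai.tech/platform/api) for more.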

docs/models/minimax.mdx Normal file

@@ -0,0 +1,27 @@
---
title: MiniMax
description: MiniMax model configuration
---
```json
{
"model": "MiniMax-M2.5",
"minimax_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Accepts `MiniMax-M2.5`, `MiniMax-M2.1`, `MiniMax-M2.1-lightning`, `MiniMax-M2`, and others |
| `minimax_api_key` | Create one in the [MiniMax console](https://platform.minimaxi.com/user-center/basic-information/interface-key) |
OpenAI-compatible access is also supported:
```json
{
"bot_type": "chatGPT",
"model": "MiniMax-M2.5",
"open_ai_api_base": "https://api.minimaxi.com/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```

docs/models/openai.mdx Normal file

@@ -0,0 +1,19 @@
---
title: OpenAI
description: OpenAI model configuration
---
```json
{
"model": "gpt-4.1-mini",
"open_ai_api_key": "YOUR_API_KEY",
"open_ai_api_base": "https://api.openai.com/v1"
}
```
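| Parameter | Description |
| --- | --- |
| `model` | Same as the [model parameter](https://platform.openai.com/docs/models) of the OpenAI API; supports the o-series, gpt-5.2, gpt-5.1, gpt-4.1, and others |
| `open_ai_api_key` | Create one on the [OpenAI Platform](https://platform.openai.com/api-keys) |
| `open_ai_api_base` | Optional; change it to use a third-party proxy endpoint |
| `bot_type` | Not needed for official OpenAI models. Set it to `chatGPT` when using non-OpenAI models such as Claude through a proxy endpoint |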

docs/models/qwen.mdx Normal file

@@ -0,0 +1,29 @@
---
title: Qwen (Tongyi Qianwen)
description: Qwen (Tongyi Qianwen) model configuration
---
# Qwen (Tongyi Qianwen)
```json
{
"model": "qwen3.5-plus",
"dashscope_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Accepts `qwen3.5-plus`, `qwen3-max`, `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwq-plus`, and others |
| `dashscope_api_key` | Create one in the [Bailian console](https://bailian.console.aliyun.com/?tab=model#/api-key); see the [official docs](https://bailian.console.aliyun.com/?tab=api#/api) |
OpenAI-compatible access is also supported:
```json
{
"bot_type": "chatGPT",
"model": "qwen3.5-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "YOUR_API_KEY"
}
```

@@ -1,121 +0,0 @@
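# CowAgent 2.0
🚀 CowAgent 2.0 is a full upgrade from a chatbot to a **super intelligent assistant**! It can now think and plan tasks proactively, keep long-term memory, operate computers and external resources, and create and run skills, so that it truly understands you and grows with you.
### ✨ Highlights
- Agent core capabilities:
  - **Complex task planning**: understands complex tasks and plans their execution autonomously, reasoning and calling tools until the goal is reached, with support for multi-turn reasoning and context understanding.
  - **Long-term memory**: automatically persists conversation memory to local files and a database, including global memory and daily memory, with keyword and vector retrieval.
  - **Built-in system tools**: more than 10 built-in tools, including file operations, Bash terminal, browser, file sending, scheduled tasks, and memory management.
  - **Skills**: a new Skill runtime engine with several built-in skills, plus support for developing custom Skills through natural-language conversation.
  - **Security and cost**: access security is controlled through the secret-key management tool, prompt constraints, and system permissions; token cost is limited by the maximum number of memory turns, the maximum context tokens, and the number of tool execution steps.
- Other updates:
  - Channel improvements: the Feishu and DingTalk channels support long-lived connections (no public IP required) and can receive and send image and file messages.
  - Model updates: added the latest models, including claude-sonnet-4-5, gemini-3-pro-preview, glm-4.7, MiniMax-M2.1, and qwen3-max.
  - Deployment improvements: added a script for one-step installation, configuration, running, and management to simplify deployment.
## 1. Long-term memory system
The Agent proactively stores important information when the user shares it, and automatically extracts a summary once a conversation reaches a certain length. A hybrid retrieval mode combining keyword search and vector search is supported.
On **first launch**, the Agent proactively asks for key information and records it in the agent profile, user identity, and memory files in the workspace (`~/cow` by default).
During **long-running conversations**, the Agent records or retrieves memories as needed, keeps updating its own profile and the user's preferences, and summarizes lessons learned, which lets it reason autonomously and keep improving.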
<img width="800" src="https://cdn.link-ai.tech/doc/20260203000455.png">
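## 2. Task planning and tool calls
The Agent chooses and calls tools based on what the task requires, completing a wide range of complex operations.
### 1. Terminal and file access
The most basic and essential tool capability. Users can interact with the Agent from a phone to operate resources on a personal computer or server:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202181130.png">
### 2. Application programming
Building on its programming and system access capabilities, the Agent can carry out the **full Vibecoding workflow**: information search, asset generation, coding, testing, deployment, Nginx configuration, and release. A single instruction from a phone is enough to produce a quick demo of an application.
<img width="800" src="https://cdn.link-ai.tech/doc/20260203121008.png">
### 3. Scheduled tasks
Three forms are supported: **one-time tasks, fixed intervals, and Cron expressions**. When a task fires, it can either **send a fixed message** or **run a dynamic Agent task**:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202195402.png">
### 4. Environment variable management
The `env_config` tool manages the secret keys that skills need. Keys can be updated through conversation, and the tool has built-in safeguards and masking of sensitive values:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202234939.png">
## 3. Skill system
Each Skill consists of an instruction file, optional scripts, and optional resources, and makes the Agent open-ended and extensible.
### 1. Skill creator
Create skills quickly through conversation, turning a workflow into a skill or integrating any third-party API:
<img width="800" src="https://cdn.link-ai.tech/doc/20260202202247.png">
### 2. Search and image recognition
- **Search skill**: the built-in `bocha-search` skill (Bocha search) works once `BOCHA_SEARCH_API_KEY` is configured.
- **Image recognition**: supports models such as `gpt-4.1-mini` and `gpt-4.1`, and works once `OPENAI_API_KEY` is configured.
<img width="800" src="https://cdn.link-ai.tech/doc/20260202213219.png">
### 3. Third-party knowledge bases and plugins
The `linkai-agent` skill makes all agents on [LinkAI](https://link-ai.tech/) available as skills, enabling multi-agent decision-making:
<img width="750" src="https://cdn.link-ai.tech/doc/20260202234350.png">
## 4. Quick start
### One-step launch
This release adds a script for one-step download, configuration, running, and management. Just run the following on the command line:
```bash
bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
```
For details, see the [project startup script](https://github.com/zhayujie/chatgpt-on-wechat/wiki/CowAgentQuickStart)
### Model selection
The following models are recommended for Agent mode:
- **Claude**: `claude-sonnet-4-5`, `claude-sonnet-4-0`
- **Gemini**: `gemini-3-flash-preview`, `gemini-3-pro-preview`
- **GLM**: `glm-4.7`
- **MiniMax**: `MiniMax-M2.1`
- **Qwen**: `qwen3-max`
For detailed configuration, see the [model section of README.md](../README.md#模型说明)
### Channels
You can interact with the Agent through Web, Feishu, DingTalk, WeCom, and other channels, so the assistant is available wherever you are. To switch channels, change `channel_type` in `config.json`:
- **Web**: the default channel. After startup it listens on a local port and is accessed through a browser.
- **Feishu**: [Feishu integration guide](https://docs.link-ai.tech/cow/multi-platform/feishu)
- **DingTalk**: [DingTalk integration guide](https://docs.link-ai.tech/cow/multi-platform/dingtalk)
- **WeCom app**: [WeCom app guide](https://docs.link-ai.tech/cow/multi-platform/wechat-com)
For more channel configuration, see the [channel section of README.md](../README.md#通道说明)
## 5. Contributing
After version 2.0, the project will keep improving Agent capabilities, expanding channels, built-in tools, and the skill system, lowering model cost, and improving security. You are welcome to [give feedback](https://github.com/zhayujie/chatgpt-on-wechat/issues) and [contribute code](https://github.com/zhayujie/chatgpt-on-wechat/pulls).
**🤖 Try CowAgent 2.0 now and start your journey with a super AI assistant!**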

@@ -1,51 +0,0 @@
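## Changelog
>**2025.05.23** [Version 1.7.6](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.6): improved the web channel, added the [AgentMesh multi-agent plugin](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/plugins/agent/README.md), improved Baidu speech synthesis, improved WeCom app `access_token` retrieval, and added support for the `claude-4-sonnet` and `claude-4-opus` models
>**2025.04.11** [Version 1.7.5](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.5): added support for the [wechatferry](https://github.com/zhayujie/chatgpt-on-wechat/pull/2562) protocol, the deepseek model, Tencent Cloud speech capabilities, and the ModelScope and Gitee-AI APIs
>**2024.12.13** [Version 1.7.4](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.4): added the Gemini 2.0 model and the web channel, fixed a memory leak, and fixed the `#reloadp` command not taking effect
>**2024.10.31** [Version 1.7.3](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.3): stability improvements, database features, Claude model improvements, linkai plugin improvements, and offline notifications
>**2024.09.26** [Version 1.7.2](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.2) and [Version 1.7.1](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.1): added the one-click install and management script, improved the Wenxin, iFlytek, and other models, and added the o1 model
>**2024.08.02** [Version 1.7.0](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.7.0): added the iFlytek 4.0 model, display of knowledge-base citation sources, and related plugin improvements
>**2024.07.19** [Version 1.6.9](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.9): added the gpt-4o-mini model and Alibaba speech recognition, and improved WeCom app channel routing
>**2024.07.05** [Version 1.6.8](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.8) and [Version 1.6.7](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.7): Claude 3.5, Gemini 1.5 Pro, and MiniMax models, image input for workflows, and a more complete model list
>**2024.06.04** [Version 1.6.6](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.6) and [Version 1.6.5](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.5): gpt-4o model, DingTalk streaming cards, and iFlytek speech recognition and synthesis
>**2024.04.26** [Version 1.6.0](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.6.0): added Kimi, upgraded gpt-4-turbo, and fixed file summarization and speech recognition issues
>**2024.03.26** [Version 1.5.8](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.8) and [Version 1.5.7](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.7): added the GLM-4 and Claude-3 models and edge-tts speech support
>**2024.01.26** [Version 1.5.6](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.6) and [Version 1.5.5](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.5): DingTalk integration, tool plugin upgrade, and 4-turbo model update
>**2023.11.11** [Version 1.5.3](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.3) and [Version 1.5.4](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.4): added the Tongyi Qianwen (Qwen) model and Google Gemini
>**2023.11.10** [Version 1.5.2](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.2): added the Feishu channel, image-recognition conversations, and blacklist configuration
>**2023.11.10** [Version 1.5.0](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/1.5.0): added the `gpt-4-turbo`, `dall-e-3`, and `tts` models, completing multimodal support for image understanding and generation and for speech recognition and synthesis
>**2023.10.16** Support for using LinkAI plugins such as web search, math calculation, and web page access through intent recognition; see the [plugin docs](https://docs.link-ai.tech/platform/plugins)
>**2023.09.26** The plugin adds one-click summarization of and conversation about files and article links; see the [plugin guide](https://github.com/zhayujie/chatgpt-on-wechat/tree/master/plugins/linkai#3%E6%96%87%E6%A1%A3%E6%80%BB%E7%BB%93%E5%AF%B9%E8%AF%9D%E5%8A%9F%E8%83%BD)
>**2023.08.08** Added the Baidu Wenxin Yiyan model, and Midjourney image generation through the [plugin](https://github.com/zhayujie/chatgpt-on-wechat/tree/master/plugins/linkai)
>**2023.06.12** Integrated the [LinkAI](https://link-ai.tech/console) platform, which lets you create domain knowledge bases online and build a dedicated customer-service bot. See the [integration guide](https://link-ai.tech/platform/link-app/wechat).
> **2023.04.26** Support for WeCom app deployment, compatible with plugins and with voice and image interaction; an ideal choice for a personal assistant. See the usage docs. (contributed by @lanvent in #944)
> **2023.04.05** Support for WeChat Official Account deployment, compatible with plugins and with voice and image interaction. See the usage docs. (contributed by @JS00000 in #686)
> **2023.04.05** Added the tool plugin, which lets ChatGPT use tools. See the usage docs. Tool-related issues can be reported to chatgpt-tool-hub. (contributed by @goldfishh in #663)
> **2023.03.25** Support for plugin-based development. Plugins implemented so far include multi-role switching, text adventure games, admin commands, and Stable Diffusion; see #578 for usage. (contributed by @lanvent in #565)
> **2023.03.09** Voice messages are parsed and answered using the whisper API (more speech API services were added later). Add the config option "speech_recognition":true to enable it; see #415 for usage. (contributed by wanggang1987 in #385)
> **2022.12.12:** Project framework set up; first integration of the ChatGPT model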

@@ -0,0 +1,24 @@
---
title: Changelog
description: CowAgent release history
---
| Version | Date | Notes |
| --- | --- | --- |
| [2.0.1](/releases/v2.0.1) | 2026.02.27 | Built-in Web Search tool, smart context management, multiple fixes |
| [2.0.0](/releases/v2.0.0) | 2026.02.03 | Full upgrade to a super Agent assistant |
| 1.7.6 | 2025.05.23 | Web Channel improvements, AgentMesh multi-agent plugin |
| 1.7.5 | 2025.04.11 | DeepSeek model |
| 1.7.4 | 2024.12.13 | Gemini 2.0 model, Web Channel |
| 1.7.3 | 2024.10.31 | Stability improvements, database features |
| 1.7.2 | 2024.09.26 | One-click install script, o1 model |
| 1.7.0 | 2024.08.02 | iFlytek 4.0 model, knowledge-base citations |
| 1.6.9 | 2024.07.19 | gpt-4o-mini, Alibaba speech recognition |
| 1.6.8 | 2024.07.05 | Claude 3.5, Gemini 1.5 Pro |
| 1.6.0 | 2024.04.26 | Kimi integration, gpt-4-turbo upgrade |
| 1.5.8 | 2024.03.26 | GLM-4, Claude-3, edge-tts |
| 1.5.2 | 2023.11.10 | Feishu channel, image-recognition conversations |
| 1.5.0 | 2023.11.10 | gpt-4-turbo, dall-e-3, tts multimodal |
| 1.0.0 | 2022.12.12 | Project created, first integration of the ChatGPT model |
For more release history, see [GitHub Releases](https://github.com/zhayujie/chatgpt-on-wechat/releases).

docs/releases/v2.0.0.mdx Normal file

@@ -0,0 +1,105 @@
---
title: v2.0.0
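description: CowAgent 2.0 - a full upgrade from chatbot to super intelligent assistant
---
CowAgent 2.0 is a full upgrade from a chatbot to a **super intelligent assistant**! It can now think and plan tasks proactively, keep long-term memory, operate computers and external resources, and create and run skills, so that it truly understands you and grows with you.
**Release date**: 2026.02.03 | [GitHub Release](https://github.com/zhayujie/chatgpt-on-wechat/releases/tag/2.0.0)
## Highlights
### Agent core capabilities
- **Complex task planning**: understands complex tasks and plans their execution autonomously, reasoning and calling tools until the goal is reached, with support for multi-turn reasoning and context understanding
- **Long-term memory**: automatically persists conversation memory to local files and a database, including global memory and daily memory, with keyword and vector retrieval
- **Built-in system tools**: more than 10 built-in tools, including file operations, Bash terminal, browser, file sending, scheduled tasks, and memory management
- **Skills**: a new Skill runtime engine with several built-in skills, plus support for developing custom Skills through natural-language conversation
- **Security and cost**: access security is controlled through the secret-key management tool, prompt constraints, and system permissions; token cost is limited by the maximum number of memory turns, the maximum context tokens, and the number of tool execution steps
### Other updates
- **Channel improvements**: the Feishu and DingTalk channels support long-lived connections (no public IP required) and can receive and send image and file messages
- **Model updates**: added the latest models, including claude-sonnet-4-5, gemini-3-pro-preview, glm-4.7, MiniMax-M2.1, and qwen3-max
- **Deployment improvements**: added a script for one-step installation, configuration, running, and management to simplify deployment
## Long-term memory system
The Agent proactively stores important information when the user shares it, and automatically extracts a summary once a conversation reaches a certain length. A hybrid retrieval mode combining keyword search and vector search is supported.
On **first launch**, the Agent proactively asks for key information and records it in the agent profile, user identity, and memory files in the workspace (`~/cow` by default).
During **long-running conversations**, the Agent records or retrieves memories as needed, keeps updating its own profile and the user's preferences, and summarizes lessons learned, which lets it reason autonomously and keep improving.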
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203000455.png" width="800" />
</Frame>
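## Task planning and tool calls
The Agent chooses and calls tools based on what the task requires, completing a wide range of complex operations.
### Terminal and file access
The most basic and essential tool capability. Users can interact with the Agent from a phone to operate resources on a personal computer or server:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202181130.png" width="800" />
</Frame>
### Application programming
Building on its programming and system access capabilities, the Agent can carry out the **full Vibecoding workflow**: information search, asset generation, coding, testing, deployment, Nginx configuration, and release. A single instruction from a phone is enough to produce a quick demo of an application.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>
### Scheduled tasks
Three forms are supported: **one-time tasks, fixed intervals, and Cron expressions**. When a task fires, it can either **send a fixed message** or **run a dynamic Agent task**:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>
### Environment variable management
The `env_config` tool manages the secret keys that skills need. Keys can be updated through conversation, and the tool has built-in safeguards and masking of sensitive values:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234939.png" width="800" />
</Frame>
## Skill system
Each Skill consists of an instruction file, optional scripts, and optional resources, and makes the Agent open-ended and extensible.
### Skill creator
Create skills quickly through conversation, turning a workflow into a skill or integrating any third-party API:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
### Web search and image recognition
- **Web search**: the built-in `web_search` tool supports multiple search engines and works once the corresponding API key is configured
- **Image recognition**: supports models such as `gpt-4.1-mini` and `gpt-4.1`, and works once `OPENAI_API_KEY` is configured
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>
### Third-party knowledge bases and plugins
The `linkai-agent` skill makes all agents on [LinkAI](https://link-ai.tech/) available as Skills, enabling multi-agent decision-making:
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>
## Contributing
After version 2.0, the project will keep improving Agent capabilities, expanding channels, built-in tools, and the skill system, lowering model cost, and improving security. You are welcome to [give feedback](https://github.com/zhayujie/chatgpt-on-wechat/issues) and [contribute code](https://github.com/zhayujie/chatgpt-on-wechat/pulls).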

docs/releases/v2.0.1.mdx Normal file

@@ -0,0 +1,36 @@
---
title: v2.0.1
description: CowAgent 2.0.1 - built-in Web Search, smart context management, multiple fixes
---
**Release date**: 2026.02 | [Full Changelog](https://github.com/zhayujie/chatgpt-on-wechat/compare/2.0.0..2.0.1)
## New features
- **Built-in Web Search tool**: integrates web search as a built-in Agent tool, reducing decision overhead ([4f0ea5d](https://github.com/zhayujie/chatgpt-on-wechat/commit/4f0ea5d7568d61db91ff69c91c429e785fd1b1c2))
- **Claude Opus 4.6 model support**: adds support for the Claude Opus 4.6 model ([#2661](https://github.com/zhayujie/chatgpt-on-wechat/pull/2661))
- **WeCom image message recognition**: the WeCom channel can now recognize image messages ([#2667](https://github.com/zhayujie/chatgpt-on-wechat/pull/2667))
## Improvements
- **Smart context management**: fixes chat context overflow by adding a smart context-trimming strategy that prevents exceeding the token limit ([cea7fb7](https://github.com/zhayujie/chatgpt-on-wechat/commit/cea7fb7490c53454602bf05955a0e9f059bcf0fd), [8acf2db](https://github.com/zhayujie/chatgpt-on-wechat/commit/8acf2dbdfe713b84ad74b761b7f86674b1c1904d)) [#2663](https://github.com/zhayujie/chatgpt-on-wechat/issues/2663)
- **Dynamic runtime information**: uses dynamic functions to automatically refresh runtime information in the system prompt, such as timestamps ([#2655](https://github.com/zhayujie/chatgpt-on-wechat/pull/2655), [#2657](https://github.com/zhayujie/chatgpt-on-wechat/pull/2657))
- **Skill prompt improvements**: improves how the Skill system prompt is generated and simplifies tool descriptions for better Agent performance ([6c21833](https://github.com/zhayujie/chatgpt-on-wechat/commit/6c218331b1f1208ea8be6bf226936d3b556ade3e))
- **Custom API base URL for Zhipu AI**: Zhipu AI can be configured with a custom API base URL ([#2660](https://github.com/zhayujie/chatgpt-on-wechat/pull/2660))
- **Startup script improvements**: improves the interactive experience and configuration flow of `run.sh` ([#2656](https://github.com/zhayujie/chatgpt-on-wechat/pull/2656))
- **Decision-round logging**: adds logging of the Agent's decision rounds to make debugging easier ([cb303e6](https://github.com/zhayujie/chatgpt-on-wechat/commit/cb303e6109c50c8dfef1f5e6c1ec47223bf3cd11))
## Bug fixes
- **Memory loss in scheduled tasks**: fixes memory loss caused by the Scheduler ([a77a874](https://github.com/zhayujie/chatgpt-on-wechat/commit/a77a8741b500a408c6f5c8868856fb4b018fe9db))
- **Empty tool calls and overlong results**: fixes handling of empty tool calls and excessively long tool results ([0542700](https://github.com/zhayujie/chatgpt-on-wechat/commit/0542700f9091ebb08c1a56103b0f0f45f24aa621))
- **OpenAI function calls**: fixes a function-call compatibility issue with OpenAI models ([158c87a](https://github.com/zhayujie/chatgpt-on-wechat/commit/158c87ab8b05bae054cc1b4eacdbb64fc1062ba9))
- **Claude tool name field**: removes the redundant tool name field from Claude model responses ([eec10cb](https://github.com/zhayujie/chatgpt-on-wechat/commit/eec10cb5db6a3d5bc12ef606606532237d2c5f6e))
- **MiniMax reasoning**: improves handling of MiniMax reasoning content and hides the thinking output ([c72cda3](https://github.com/zhayujie/chatgpt-on-wechat/commit/c72cda33864bd1542012ee6e0a8bd8c6c88cb5ed), [72b1cac](https://github.com/zhayujie/chatgpt-on-wechat/commit/72b1cacea1ba0d1f3dedacbab2e088e98fd7e172))
- **Zhipu AI thinking output**: hides the thinking output of Zhipu AI models ([72b1cac](https://github.com/zhayujie/chatgpt-on-wechat/commit/72b1cacea1ba0d1f3dedacbab2e088e98fd7e172))
- **Feishu connection and certificates**: fixes SSL certificate errors and connection failures in the Feishu channel ([229b14b](https://github.com/zhayujie/chatgpt-on-wechat/commit/229b14b6fcabe7123d53cab1dea39f38dab26d6d), [8674421](https://github.com/zhayujie/chatgpt-on-wechat/commit/867442155e7f095b4f38b0856f8c1d8312b5fcf7))
- **model_type type check**: fixes an `AttributeError` caused by a non-string `model_type` ([#2666](https://github.com/zhayujie/chatgpt-on-wechat/pull/2666))
## Platform compatibility
- **Windows compatibility**: fixes path handling, file encoding, the unavailable `os.getuid()`, and other issues on Windows across several tool modules ([051ffd7](https://github.com/zhayujie/chatgpt-on-wechat/commit/051ffd78a372f71a967fd3259e37fe19131f83cf), [5264f7c](https://github.com/zhayujie/chatgpt-on-wechat/commit/5264f7ce18360ee4db5dcb4ebe67307977d40014))

@@ -0,0 +1,33 @@
---
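title: Image recognition
description: Recognize images with OpenAI vision models
---
# openai-image-vision
Uses OpenAI's GPT-4 Vision API to analyze image content and understand elements such as objects, text, and colors in the image.
## Requirements
| Requirement | Description |
| --- | --- |
| `OPENAI_API_KEY` | OpenAI API key |
| `curl`, `base64` | System commands (usually preinstalled) |
To configure:
- Set `OPENAI_API_KEY` through the `env_config` tool
- Or fill in `open_ai_api_key` in `config.json`
## Supported models
- `gpt-4.1-mini` (recommended, good value)
- `gpt-4.1`
## Usage
Once configured, send an image to the Agent and image recognition is triggered automatically.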
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202213219.png" width="800" />
</Frame>

docs/skills/index.mdx Normal file

@@ -0,0 +1,67 @@
---
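title: Skill overview
description: Introduction to the CowAgent skill system
---
Skills make the Agent open-ended and extensible. Each Skill consists of an instruction file (`SKILL.md`), optional scripts, and optional resources, and describes how to complete a particular type of task.
How a Skill differs from a Tool: a Tool is an atomic operation implemented in code (such as reading or writing a file, or running a command), while a Skill is a higher-level workflow driven by an instruction file that can combine multiple Tool calls to complete a complex task.
## Built-in skills
Located in the project's `skills/` directory and enabled automatically depending on whether their requirements are met:
| Skill | Description | Requires |
| --- | --- | --- |
| [`skill-creator`](/skills/skill-creator) | Create custom skills through conversation | None |
| [`openai-image-vision`](/skills/image-vision) | Recognize images with OpenAI vision models | `OPENAI_API_KEY` |
| [`linkai-agent`](/skills/linkai-agent) | Connect to agents on the LinkAI platform | `LINKAI_API_KEY` |
| [`web-fetch`](/skills/web-fetch) | Fetch the text content of web pages | `curl` (enabled by default) |
## Custom skills
Created by the user through conversation and stored in the workspace (`~/cow/skills/`). They can implement complex business workflows and integrations with third-party systems.
## Skill loading priority
1. **Workspace skills** (highest): `~/cow/skills/`
2. **Built-in project skills** (lowest): `skills/`
When two skills share a name, the higher-priority one overrides the other.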
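The override rule can be sketched as a dictionary that is filled from the lowest-priority directory first, so later entries replace earlier ones. The `discover_skills` helper is hypothetical; it only assumes that a skill is a directory containing `SKILL.md`.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def discover_skills(dirs_low_to_high: List[Path]) -> Dict[str, Path]:
    """Map skill name -> directory; later (higher-priority) roots override earlier ones."""
    skills: Dict[str, Path] = {}
    for root in dirs_low_to_high:
        if not root.is_dir():
            continue
        for entry in sorted(root.iterdir()):
            # A directory counts as a skill only if it contains SKILL.md.
            if (entry / "SKILL.md").is_file():
                skills[entry.name] = entry
    return skills


# Demo with temporary directories standing in for skills/ and ~/cow/skills/.
with tempfile.TemporaryDirectory() as tmp:
    builtin, workspace = Path(tmp) / "builtin", Path(tmp) / "workspace"
    for root, names in ((builtin, ["web-fetch", "shared"]), (workspace, ["shared", "mine"])):
        for name in names:
            (root / name).mkdir(parents=True)
            (root / name / "SKILL.md").write_text("demo")
    found = discover_skills([builtin, workspace])
    print(sorted(found), found["shared"].parent.name)
```

In real use the call would look like `discover_skills([Path("skills"), Path("~/cow/skills").expanduser()])`.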
## Skill file structure
```
skills/
├── my-skill/
│ ├── SKILL.md # Skill description (frontmatter + instructions)
│ ├── scripts/ # Execution scripts (optional)
│ └── resources/ # Additional resources (optional)
```
### SKILL.md format
```markdown
---
name: my-skill
description: Brief description of the skill
metadata:
  emoji: 🔧
  requires:
    bins: ["curl"]
    env: ["MY_API_KEY"]
  primaryEnv: "MY_API_KEY"
---
# My Skill
Detailed instructions...
```
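| Field | Description |
| --- | --- |
| `name` | Skill name; must match the directory name |
| `description` | Skill description; the Agent uses it to decide whether to invoke the skill |
| `metadata.requires.bins` | Required system commands |
| `metadata.requires.env` | Required environment variables |
| `metadata.always` | Whether to always load the skill (default false) |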
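The `requires` fields suggest a simple availability check: a skill is usable only when every listed command is on the `PATH` and every listed environment variable is set. The sketch below illustrates that idea with a hypothetical `is_skill_enabled` helper; the project's actual check may differ.

```python
import os
import shutil
from typing import Dict, List, Mapping, Optional


def is_skill_enabled(metadata: Dict, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when every required command and environment variable is present."""
    env = os.environ if env is None else env
    requires = metadata.get("requires", {})
    bins: List[str] = requires.get("bins", [])
    env_vars: List[str] = requires.get("env", [])
    has_bins = all(shutil.which(name) is not None for name in bins)  # command on PATH
    has_env = all(env.get(name) for name in env_vars)                # variable set and non-empty
    return has_bins and has_env


metadata = {"requires": {"env": ["MY_API_KEY"]}}
print(is_skill_enabled(metadata, env={"MY_API_KEY": "secret"}))
```

Passing `env` explicitly keeps the check easy to test; by default it reads the real process environment.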

@@ -0,0 +1,49 @@
---
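title: LinkAI agents
description: Multi-agent skill that connects to the LinkAI platform
---
# linkai-agent
Uses agents on the [LinkAI](https://link-ai.tech/) platform as Skills to enable multi-agent decision-making. The Agent chooses among them based on each agent's name and description, and calls the corresponding app or workflow through its `app_code`.
## Requirements
| Requirement | Description |
| --- | --- |
| `LINKAI_API_KEY` | LinkAI platform API key; create one in the [console](https://link-ai.tech/console/interface) |
| `curl` | System command (usually preinstalled) |
To configure:
- Set `LINKAI_API_KEY` through the `env_config` tool
- Or fill in `linkai_api_key` in `config.json`
## Configuring agents
Add the available agents to `skills/linkai-agent/config.json`: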
```json
{
"apps": [
{
"app_code": "G7z6vKwp",
"app_name": "LinkAI客服助手",
"app_description": "当用户需要了解LinkAI平台相关问题时才选择该助手"
},
{
"app_code": "SFY5x7JR",
"app_name": "内容创作助手",
"app_description": "当用户需要创作图片或视频时才使用该助手"
}
]
}
```
## Usage
Once configured, the Agent automatically picks a suitable LinkAI agent to answer based on the user's question.
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202234350.png" width="750" />
</Frame>

@@ -0,0 +1,33 @@
---
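title: Creating skills
description: Create custom skills through conversation
---
# skill-creator
Quickly create, install, or update skills through natural-language conversation.
## Requirements
No extra requirements; always available.
## Usage
- Turn a workflow into a skill: "Help me turn this deployment process into a skill"
- Integrate a third-party API: "Create a skill from this API documentation"
- Install a remote skill: "Help me install the xxx skill"
## Creation flow
1. Tell the Agent what the skill should do
2. The Agent automatically generates the `SKILL.md` instruction file and scripts
3. The skill is saved to the workspace's `~/cow/skills/` directory
4. In later conversations, the Agent automatically recognizes and uses the skill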
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202202247.png" width="800" />
</Frame>
<Tip>
For detailed development documentation, see the [skill creator guide](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/skills/skill-creator/SKILL.md).
</Tip>

docs/skills/web-fetch.mdx Normal file

@@ -0,0 +1,33 @@
---
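title: Web fetch
description: Fetch the text content of web pages
---
# web-fetch
Uses curl to fetch a web page and extract its readable text. It is a lightweight way to access web pages that needs no browser automation.
## Requirements
| Requirement | Description |
| --- | --- |
| `curl` | System command (usually preinstalled) |
The skill sets `always: true`, so it is enabled by default as long as the `curl` command is available on the system.
## Usage
The Agent calls it automatically whenever it needs the content of a URL; no extra configuration is required.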
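The text-extraction half of this idea can be sketched with Python's standard `html.parser`. This is an illustration only: the skill itself relies on curl, and the `TextExtractor` class and `html_to_text` function below are hypothetical names, not part of the project.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


page = ("<html><head><style>p{color:red}</style></head><body>"
        "<h1>Title</h1><p>Hello <b>world</b></p><script>var x = 1;</script></body></html>")
print(html_to_text(page))
```

Because nothing executes the page's JavaScript, content that is rendered client-side will not appear in the output, which is the limitation that the browser tool addresses.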
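## Differences from the browser tool
| Feature | web-fetch (skill) | browser (tool) |
| --- | --- | --- |
| Dependencies | curl only | browser-use + playwright |
| JS rendering | Not supported | Supported |
| Page interaction | Not supported | Supports clicking, typing, and more |
| Use case | Getting text from static pages | Operating dynamic web pages |
<Tip>
web-fetch is enough for most web content retrieval. The browser tool is only needed when JS rendering or page interaction is required.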
</Tip>

docs/tools/bash.mdx Normal file

@@ -0,0 +1,30 @@
---
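title: bash - Terminal
description: Run system commands
---
# bash
Runs a Bash command in the current working directory and returns stdout and stderr. API keys configured through `env_config` are injected into the environment variables automatically.
## Requirements
No extra requirements; available by default.
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `command` | string | Yes | The command to run |
| `timeout` | integer | No | Timeout in seconds |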
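A tool with these two parameters could be approximated with `subprocess.run`, as in the sketch below. It is an assumption-laden illustration, not the project's implementation: `run_bash` and its `extra_env` argument are hypothetical, and `shell=True` uses the system's default shell (`/bin/sh` on POSIX) rather than Bash specifically.

```python
import os
import subprocess
from typing import Dict, Optional


def run_bash(command: str, timeout: Optional[int] = None,
             extra_env: Optional[Dict[str, str]] = None) -> Dict[str, object]:
    """Run a shell command and return its stdout, stderr, and exit code."""
    env = {**os.environ, **(extra_env or {})}  # inject configured keys into the environment
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True,
            timeout=timeout, env=env,
        )
    except subprocess.TimeoutExpired:
        # The command exceeded the timeout; report it instead of raising.
        return {"stdout": "", "stderr": f"timed out after {timeout}s", "exit_code": None}
    return {"stdout": proc.stdout, "stderr": proc.stderr, "exit_code": proc.returncode}


result = run_bash('echo "$DEMO_KEY"', timeout=10, extra_env={"DEMO_KEY": "abc"})
print(result["stdout"].strip(), result["exit_code"])
```

Returning the timeout as a normal result, rather than raising, lets the Agent see the failure and decide what to try next.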
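## Use cases
- Installing packages and dependencies
- Running code and tests
- Deploying applications and services (Nginx configuration, process management, etc.)
- System operations and troubleshooting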
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260203121008.png" width="800" />
</Frame>

Some files were not shown because too many files have changed in this diff