Compare commits

..

15 Commits

Author SHA1 Message Date
zhayujie
1c71c4e38b feat: agent chat service 2026-02-21 00:39:36 +08:00
zhayujie
5e3eccb3f6 feat: support memory service 2026-02-20 23:44:05 +08:00
zhayujie
e1dc037eb9 feat: cloud skills manage 2026-02-20 23:23:04 +08:00
zhayujie
97e9b4c801 Merge branch 'master' into feat-config-update 2026-02-20 18:58:21 +08:00
zhayujie
52d7cad735 feat: support gemini-3.1-pro-preview and claude-4.6-sonnet 2026-02-20 12:14:59 +08:00
zhayujie
c0b1d270ba Merge branch 'master' of github.com:zhayujie/chatgpt-on-wechat 2026-02-19 14:18:39 +08:00
zhayujie
e59a2892e4 feat: support qwen3.5-plus 2026-02-19 14:18:16 +08:00
zhayujie
5fa0376a49 Merge pull request #2670 from SgtPepper114/fix/gemini-dingtalk-image-inline
fix(gemini): 修复钉钉图片标记未转多模态导致的识图失效
2026-02-19 13:57:04 +08:00
SgtPepper114
05a33042c8 fix(gemini): support dingtalk image markers as multimodal input
- parse [图片: path] markers in text and convert to Gemini inlineData parts

- unify reply path via call_with_tools to reuse multimodal conversion

- keep legacy safety behavior (BLOCK_NONE) and restore safety ratings logging on empty response

- add multimodal request image-part count log for debugging
2026-02-16 13:26:57 +00:00
zhayujie
ce58f23cbc feat: dashscope model name 2026-02-16 20:11:38 +08:00
zhayujie
b6fc9fa370 fix: run script dependency issues 2026-02-15 00:02:50 +08:00
zhayujie
00ae38faae docs: update models in README 2026-02-14 17:36:36 +08:00
zhayujie
ab28ee58ab feat: add doubao-2.0-code model and update README 2026-02-14 16:49:44 +08:00
zhayujie
48db538a2e feat: support Minimax-M2.5, glm-5, kimi-k2.5 2026-02-14 15:27:44 +08:00
zhayujie
46945942e1 feat: support channel start in sub thread 2026-02-13 12:38:52 +08:00
41 changed files with 2883 additions and 530 deletions

118
README.md
View File

@@ -18,7 +18,7 @@
-**长期记忆:** 自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索
-**技能系统:** 实现了Skills创建和运行的引擎内置多种技能并支持通过自然语言对话完成自定义Skills开发
-**多模态消息:** 支持对文本、图片、语音、文件等多类型消息进行解析、处理、生成、发送等操作
-**多模型接入:** 支持OpenAI, Claude, Gemini, DeepSeek, MiniMax、GLM、Qwen、Kimi等国内外主流模型厂商
-**多模型接入:** 支持OpenAI, Claude, Gemini, DeepSeek, MiniMax、GLM、Qwen、Kimi、Doubao等国内外主流模型厂商
-**多端部署:** 支持运行在本地计算机或服务器,可集成到网页、飞书、钉钉、微信公众号、企业微信应用中使用
-**知识库:** 集成企业知识库能力让Agent成为专属数字员工基于[LinkAI](https://link-ai.tech)平台实现
@@ -90,7 +90,7 @@ bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
项目支持国内外主流厂商的模型接口,可选模型及配置说明参考:[模型说明](#模型说明)。
> Agent模式下推荐使用以下模型可根据效果及成本综合选择GLM(glm-4.7)、MiniMAx(MiniMax-M2.1)、Qwen(qwen3-max)、Claude(claude-opus-4-6、claude-sonnet-4-5、claude-sonnet-4-0)、Gemini(gemini-3-flash-preview、gemini-3-pro-preview)
> Agent模式下推荐使用以下模型可根据效果及成本综合选择MiniMax-M2.5、glm-5、kimi-k2.5、qwen3.5-plus、claude-sonnet-4-6、gemini-3.1-pro-preview
同时支持使用 **LinkAI平台** 接口,可灵活切换 OpenAI、Claude、Gemini、DeepSeek、Qwen、Kimi 等多种常用模型并支持知识库、工作流、插件等Agent能力参考 [接口文档](https://docs.link-ai.tech/platform/api)。
@@ -136,9 +136,11 @@ pip3 install -r requirements-optional.txt
# config.json 文件内容示例
{
"channel_type": "web", # 接入渠道类型默认为web支持修改为:feishu,dingtalk,wechatcom_app,terminal,wechatmp,wechatmp_service
"model": "MiniMax-M2.1", # 模型名称
"model": "MiniMax-M2.5", # 模型名称
"minimax_api_key": "", # MiniMax API Key
"zhipu_ai_api_key": "", # 智谱GLM API Key
"moonshot_api_key": "", # Kimi/Moonshot API Key
"ark_api_key": "", # 豆包(火山方舟) API Key
"dashscope_api_key": "", # 百炼(通义千问)API Key
"claude_api_key": "", # Claude API Key
"claude_api_base": "https://api.anthropic.com/v1", # Claude API 地址,修改可接入三方代理平台
@@ -173,13 +175,13 @@ pip3 install -r requirements-optional.txt
<details>
<summary>2. 其他配置</summary>
+ `model`: 模型名称Agent模式下推荐使用 `glm-4.7``MiniMax-M2.1``qwen3-max``claude-opus-4-6``claude-sonnet-4-5``claude-sonnet-4-0``gemini-3-flash-preview``gemini-3-pro-preview`,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `model`: 模型名称Agent模式下推荐使用 `MiniMax-M2.5``glm-5``kimi-k2.5``qwen3.5-plus``claude-sonnet-4-6``gemini-3.1-pro-preview`,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `character_desc`普通对话模式下的机器人系统提示词。在Agent模式下该配置不生效由工作空间中的文件内容构成。
+ `subscribe_msg`订阅消息公众号和企业微信channel中请填写当被订阅时会自动回复 可使用特殊占位符。目前支持的占位符有{trigger_prefix}在程序中它会自动替换成bot的触发词。
</details>
<details>
<summary>5. LinkAI配置</summary>
<summary>3. LinkAI配置</summary>
+ `use_linkai`: 是否使用LinkAI接口默认关闭设置为true后可对接LinkAI平台使用知识库、工作流、插件等能力, 参考[接口文档](https://docs.link-ai.tech/platform/api/chat)
+ `linkai_api_key`: LinkAI Api Key可在 [控制台](https://link-ai.tech/console/interface) 创建
@@ -309,24 +311,24 @@ volumes:
```json
{
"model": "MiniMax-M2.1",
"model": "MiniMax-M2.5",
"minimax_api_key": ""
}
```
- `model`: 可填写 `MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2、abab6.5-chat`
- `model`: 可填写 `MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2、abab6.5-chat`
- `minimax_api_key`MiniMax平台的API-KEY在 [控制台](https://platform.minimaxi.com/user-center/basic-information/interface-key) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "MiniMax-M2.1",
"model": "MiniMax-M2.5",
"open_ai_api_base": "https://api.minimaxi.com/v1",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填 `MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2`,参考[API文档](https://platform.minimaxi.com/document/%E5%AF%B9%E8%AF%9D?key=66701d281d57f38758d581d0#QklxsNSbaf6kM4j6wjO5eEek)
- `model`: 可填 `MiniMax-M2.5、MiniMax-M2.1、MiniMax-M2.1-lightning、MiniMax-M2`,参考[API文档](https://platform.minimaxi.com/document/%E5%AF%B9%E8%AF%9D?key=66701d281d57f38758d581d0#QklxsNSbaf6kM4j6wjO5eEek)
- `open_ai_api_base`: MiniMax平台API的 BASE URL
- `open_ai_api_key`: MiniMax平台的API-KEY
</details>
@@ -338,24 +340,24 @@ volumes:
```json
{
"model": "glm-4.7",
"model": "glm-5",
"zhipu_ai_api_key": ""
}
```
- `model`: 可填 `glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等, 参考 [glm-4系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
- `model`: 可填 `glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long` 等, 参考 [glm系列模型编码](https://bigmodel.cn/dev/api/normal-model/glm-4)
- `zhipu_ai_api_key`: 智谱AI平台的 API KEY在 [控制台](https://www.bigmodel.cn/usercenter/proj-mgmt/apikeys) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "glm-4.7",
"model": "glm-5",
"open_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填 `glm-4.7、glm-4.6、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long`
- `model`: 可填 `glm-5、glm-4.7、glm-4-plus、glm-4-flash、glm-4-air、glm-4-airx、glm-4-long`
- `open_ai_api_base`: 智谱AI平台的 BASE URL
- `open_ai_api_key`: 智谱AI平台的 API KEY
</details>
@@ -367,18 +369,18 @@ volumes:
```json
{
"model": "qwen3-max",
"model": "qwen3.5-plus",
"dashscope_api_key": "sk-qVxxxxG"
}
```
- `model`: 可填写 `qwen3-max、qwen-max、qwen-plus、qwen-turbo、qwen-long、qwq-plus`
- `model`: 可填写 `qwen3.5-plus、qwen3-max、qwen-max、qwen-plus、qwen-turbo、qwen-long、qwq-plus`
- `dashscope_api_key`: 通义千问的 API-KEY参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) ,在 [控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "qwen3-max",
"model": "qwen3.5-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "sk-qVxxxxG"
}
@@ -389,6 +391,53 @@ volumes:
- `open_ai_api_key`: 通义千问的 API-KEY
</details>
<details>
<summary>Kimi (Moonshot)</summary>
方式一:官方接入,配置如下:
```json
{
"model": "kimi-k2.5",
"moonshot_api_key": ""
}
```
- `model`: 可填写 `kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- `moonshot_api_key`: Moonshot的API-KEY在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "kimi-k2.5",
"open_ai_api_base": "https://api.moonshot.cn/v1",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填写 `kimi-k2.5、kimi-k2、moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- `open_ai_api_base`: Moonshot的 BASE URL
- `open_ai_api_key`: Moonshot的 API-KEY
</details>
<details>
<summary>豆包 (Doubao)</summary>
1. API Key创建在 [火山方舟控制台](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) 创建API Key
2. 填写配置
```json
{
"model": "doubao-seed-2-0-code-preview-260215",
"ark_api_key": "YOUR_API_KEY"
}
```
- `model`: 可填写 `doubao-seed-2-0-code-preview-260215、doubao-seed-2-0-pro-260215、doubao-seed-2-0-lite-260215、doubao-seed-2-0-mini-260215`
- `ark_api_key`: 火山方舟平台的 API Key在 [控制台](https://console.volcengine.com/ark/region:ark+cn-beijing/apikey) 创建
- `ark_base_url`: 可选,默认为 `https://ark.cn-beijing.volces.com/api/v3`
</details>
<details>
<summary>Claude</summary>
@@ -398,11 +447,11 @@ volumes:
```json
{
"model": "claude-sonnet-4-5",
"model": "claude-sonnet-4-6",
"claude_api_key": "YOUR_API_KEY"
}
```
- `model`: 参考 [官方模型ID](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases) ,支持 `claude-opus-4-6、claude-sonnet-4-5、claude-sonnet-4-0、claude-opus-4-0、claude-3-5-sonnet-latest`
- `model`: 参考 [官方模型ID](https://docs.anthropic.com/en/docs/about-claude/models/overview#model-aliases) ,支持 `claude-sonnet-4-6、claude-opus-4-6、claude-sonnet-4-5、claude-sonnet-4-0、claude-opus-4-0、claude-3-5-sonnet-latest`
</details>
<details>
@@ -411,11 +460,11 @@ volumes:
API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn) 创建API Key ,配置如下
```json
{
"model": "gemini-3-flash-preview",
"model": "gemini-3.1-pro-preview",
"gemini_api_key": ""
}
```
- `model`: 参考[官方文档-模型列表](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn),支持 `gemini-3-flash-preview、gemini-3-pro-preview、gemini-2.5-pro、gemini-2.0-flash`
- `model`: 参考[官方文档-模型列表](https://ai.google.dev/gemini-api/docs/models?hl=zh-cn),支持 `gemini-3.1-pro-preview、gemini-3-flash-preview、gemini-3-pro-preview、gemini-2.5-pro、gemini-2.0-flash`
</details>
<details>
@@ -441,35 +490,6 @@ API Key创建在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn)
- `open_ai_api_base`: DeepSeek平台 BASE URL
</details>
<details>
<summary>Kimi (Moonshot)</summary>
方式一:官方接入,配置如下:
```json
{
"model": "moonshot-v1-128k",
"moonshot_api_key": ""
}
```
- `model`: 可填写 `moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- `moonshot_api_key`: Moonshot的API-KEY在 [控制台](https://platform.moonshot.cn/console/api-keys) 创建
方式二OpenAI兼容方式接入配置如下
```json
{
"bot_type": "chatGPT",
"model": "moonshot-v1-128k",
"open_ai_api_base": "https://api.moonshot.cn/v1",
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI兼容方式
- `model`: 可填写 `moonshot-v1-8k、moonshot-v1-32k、moonshot-v1-128k`
- `open_ai_api_base`: Moonshot的 BASE URL
- `open_ai_api_key`: Moonshot的 API-KEY
</details>
<details>
<summary>Azure</summary>

3
agent/chat/__init__.py Normal file
View File

@@ -0,0 +1,3 @@
from agent.chat.service import ChatService
__all__ = ["ChatService"]

168
agent/chat/service.py Normal file
View File

@@ -0,0 +1,168 @@
"""
ChatService - Wraps the Agent stream execution to produce CHAT protocol chunks.
Translates agent events (message_update, message_end, tool_execution_end, etc.)
into the CHAT socket protocol format (content chunks with segment_id, tool_calls chunks).
"""
import time
from typing import Callable, Optional
from common.log import logger
class ChatService:
"""
High-level service that runs an Agent for a given query and streams
the results as CHAT protocol chunks via a callback.
Usage:
svc = ChatService(agent_bridge)
svc.run(query, session_id, send_chunk_fn)
"""
def __init__(self, agent_bridge):
"""
:param agent_bridge: AgentBridge instance (manages agent lifecycle)
"""
self.agent_bridge = agent_bridge
def run(self, query: str, session_id: str, send_chunk_fn: Callable[[dict], None]):
"""
Run the agent for *query* and stream results back via *send_chunk_fn*.
The method blocks until the agent finishes. After it returns the SDK
will automatically send the final (streaming=false) message.
:param query: user query text
:param session_id: session identifier for agent isolation
:param send_chunk_fn: callable(chunk_data: dict) to send a streaming chunk
"""
agent = self.agent_bridge.get_agent(session_id=session_id)
if agent is None:
raise RuntimeError("Failed to initialise agent for the session")
# State shared between the event callback and this method
state = _StreamState()
def on_event(event: dict):
"""Translate agent events into CHAT protocol chunks."""
event_type = event.get("type")
data = event.get("data", {})
if event_type == "message_update":
# Incremental text delta
delta = data.get("delta", "")
if delta:
send_chunk_fn({
"chunk_type": "content",
"delta": delta,
"segment_id": state.segment_id,
})
elif event_type == "message_end":
# A content segment finished.
tool_calls = data.get("tool_calls", [])
if tool_calls:
# After tool_calls are executed the next content will be
# a new segment; collect tool results until turn_end.
state.pending_tool_results = []
elif event_type == "tool_execution_end":
tool_name = data.get("tool_name", "")
arguments = data.get("arguments", {})
result = data.get("result", "")
status = data.get("status", "unknown")
execution_time = data.get("execution_time", 0)
elapsed_str = f"{execution_time:.2f}s"
# Serialise result to string if needed
if not isinstance(result, str):
import json
try:
result = json.dumps(result, ensure_ascii=False)
except Exception:
result = str(result)
tool_info = {
"name": tool_name,
"arguments": arguments,
"result": result,
"status": status,
"elapsed": elapsed_str,
}
if state.pending_tool_results is not None:
state.pending_tool_results.append(tool_info)
elif event_type == "turn_end":
has_tool_calls = data.get("has_tool_calls", False)
if has_tool_calls and state.pending_tool_results:
# Flush collected tool results as a single tool_calls chunk
send_chunk_fn({
"chunk_type": "tool_calls",
"tool_calls": state.pending_tool_results,
})
state.pending_tool_results = None
# Next content belongs to a new segment
state.segment_id += 1
# Run the agent with our event callback ---------------------------
logger.info(f"[ChatService] Starting agent run: session={session_id}, query={query[:80]}")
from config import conf
max_context_turns = conf().get("agent_max_context_turns", 30)
# Get full system prompt with skills
full_system_prompt = agent.get_full_system_prompt()
# Create a copy of messages for this execution
with agent.messages_lock:
messages_copy = agent.messages.copy()
original_length = len(agent.messages)
from agent.protocol.agent_stream import AgentStreamExecutor
executor = AgentStreamExecutor(
agent=agent,
model=agent.model,
system_prompt=full_system_prompt,
tools=agent.tools,
max_turns=agent.max_steps,
on_event=on_event,
messages=messages_copy,
max_context_turns=max_context_turns,
)
try:
response = executor.run_stream(query)
except Exception:
# If executor cleared messages (context overflow), sync back
if len(executor.messages) == 0:
with agent.messages_lock:
agent.messages.clear()
logger.info("[ChatService] Cleared agent message history after executor recovery")
raise
# Append only the NEW messages from this execution (thread-safe)
with agent.messages_lock:
new_messages = executor.messages[original_length:]
agent.messages.extend(new_messages)
# Store executor reference for files_to_send access
agent.stream_executor = executor
# Execute post-process tools
agent._execute_post_process_tools()
logger.info(f"[ChatService] Agent run completed: session={session_id}")
class _StreamState:
"""Mutable state shared between the event callback and the run method."""
def __init__(self):
self.segment_id: int = 0
# None means we are not accumulating tool results right now.
# A list means we are in the middle of a tool-execution phase.
self.pending_tool_results: Optional[list] = None

167
agent/memory/service.py Normal file
View File

@@ -0,0 +1,167 @@
"""
Memory service for handling memory query operations via cloud protocol.
Provides a unified interface for listing and reading memory files,
callable from the cloud client (LinkAI) or a future web console.
Memory file layout (under workspace_root):
MEMORY.md -> type: global
memory/2026-02-20.md -> type: daily
"""
import os
from datetime import datetime
from typing import Dict, List, Optional
from pathlib import Path
from common.log import logger
class MemoryService:
"""
High-level service for memory file queries.
Operates directly on the filesystem — no MemoryManager dependency.
"""
def __init__(self, workspace_root: str):
"""
:param workspace_root: Workspace root directory (e.g. ~/cow)
"""
self.workspace_root = workspace_root
self.memory_dir = os.path.join(workspace_root, "memory")
# ------------------------------------------------------------------
# list — paginated file metadata
# ------------------------------------------------------------------
def list_files(self, page: int = 1, page_size: int = 20) -> dict:
"""
List all memory files with metadata (without content).
Returns::
{
"page": 1,
"page_size": 20,
"total": 15,
"list": [
{"filename": "MEMORY.md", "type": "global", "size": 2048, "updated_at": "2026-02-20 10:00:00"},
{"filename": "2026-02-20.md", "type": "daily", "size": 512, "updated_at": "2026-02-20 09:30:00"},
...
]
}
"""
files: List[dict] = []
# 1. Global memory — MEMORY.md in workspace root
global_path = os.path.join(self.workspace_root, "MEMORY.md")
if os.path.isfile(global_path):
files.append(self._file_info(global_path, "MEMORY.md", "global"))
# 2. Daily memory files — memory/*.md (sorted newest first)
if os.path.isdir(self.memory_dir):
daily_files = []
for name in os.listdir(self.memory_dir):
full = os.path.join(self.memory_dir, name)
if os.path.isfile(full) and name.endswith(".md"):
daily_files.append((name, full))
# Sort by filename descending (newest date first)
daily_files.sort(key=lambda x: x[0], reverse=True)
for name, full in daily_files:
files.append(self._file_info(full, name, "daily"))
total = len(files)
# Paginate
start = (page - 1) * page_size
end = start + page_size
page_items = files[start:end]
return {
"page": page,
"page_size": page_size,
"total": total,
"list": page_items,
}
# ------------------------------------------------------------------
# content — read a single file
# ------------------------------------------------------------------
def get_content(self, filename: str) -> dict:
"""
Read the full content of a memory file.
:param filename: File name, e.g. ``MEMORY.md`` or ``2026-02-20.md``
:return: dict with ``filename`` and ``content``
:raises FileNotFoundError: if the file does not exist
"""
path = self._resolve_path(filename)
if not os.path.isfile(path):
raise FileNotFoundError(f"Memory file not found: {filename}")
with open(path, "r", encoding="utf-8") as f:
content = f.read()
return {
"filename": filename,
"content": content,
}
# ------------------------------------------------------------------
# dispatch — single entry point for protocol messages
# ------------------------------------------------------------------
def dispatch(self, action: str, payload: Optional[dict] = None) -> dict:
"""
Dispatch a memory management action.
:param action: ``list`` or ``content``
:param payload: action-specific payload
:return: protocol-compatible response dict
"""
payload = payload or {}
try:
if action == "list":
page = payload.get("page", 1)
page_size = payload.get("page_size", 20)
result_payload = self.list_files(page=page, page_size=page_size)
return {"action": action, "code": 200, "message": "success", "payload": result_payload}
elif action == "content":
filename = payload.get("filename")
if not filename:
return {"action": action, "code": 400, "message": "filename is required", "payload": None}
result_payload = self.get_content(filename)
return {"action": action, "code": 200, "message": "success", "payload": result_payload}
else:
return {"action": action, "code": 400, "message": f"unknown action: {action}", "payload": None}
except FileNotFoundError as e:
return {"action": action, "code": 404, "message": str(e), "payload": None}
except Exception as e:
logger.error(f"[MemoryService] dispatch error: action={action}, error={e}")
return {"action": action, "code": 500, "message": str(e), "payload": None}
# ------------------------------------------------------------------
# internal helpers
# ------------------------------------------------------------------
def _resolve_path(self, filename: str) -> str:
"""
Resolve a filename to its absolute path.
- ``MEMORY.md`` → ``{workspace_root}/MEMORY.md``
- ``2026-02-20.md`` → ``{workspace_root}/memory/2026-02-20.md``
"""
if filename == "MEMORY.md":
return os.path.join(self.workspace_root, filename)
return os.path.join(self.memory_dir, filename)
@staticmethod
def _file_info(path: str, filename: str, file_type: str) -> dict:
"""Build a file metadata dict."""
stat = os.stat(path)
updated_at = datetime.fromtimestamp(stat.st_mtime).strftime("%Y-%m-%d %H:%M:%S")
return {
"filename": filename,
"type": file_type,
"size": stat.st_size,
"updated_at": updated_at,
}

View File

@@ -1,4 +1,5 @@
import json
import os
import time
import threading
@@ -61,7 +62,8 @@ class Agent:
# Auto-create skill manager
try:
from agent.skills import SkillManager
self.skill_manager = SkillManager(workspace_dir=workspace_dir)
custom_dir = os.path.join(workspace_dir, "skills") if workspace_dir else None
self.skill_manager = SkillManager(custom_dir=custom_dir)
logger.debug(f"Initialized SkillManager with {len(self.skill_manager.skills)} skills")
except Exception as e:
logger.warning(f"Failed to initialize SkillManager: {e}")

View File

@@ -583,6 +583,11 @@ class AgentStreamExecutor:
if finish_reason:
stop_reason = finish_reason
# Skip reasoning_content (internal thinking from models like GLM-5)
reasoning_delta = delta.get("reasoning_content") or ""
# if reasoning_delta:
# logger.debug(f"🧠 [thinking] {reasoning_delta[:100]}...")
# Handle text content
content_delta = delta.get("content") or ""
if content_delta:

View File

@@ -15,6 +15,7 @@ from agent.skills.types import (
)
from agent.skills.loader import SkillLoader
from agent.skills.manager import SkillManager
from agent.skills.service import SkillService
from agent.skills.formatter import format_skills_for_prompt
__all__ = [
@@ -25,5 +26,6 @@ __all__ = [
"LoadSkillsResult",
"SkillLoader",
"SkillManager",
"SkillService",
"format_skills_for_prompt",
]

View File

@@ -12,25 +12,20 @@ from agent.skills.frontmatter import parse_frontmatter, parse_metadata, parse_bo
class SkillLoader:
"""Loads skills from various directories."""
def __init__(self, workspace_dir: Optional[str] = None):
"""
Initialize the skill loader.
:param workspace_dir: Agent workspace directory (for workspace-specific skills)
"""
self.workspace_dir = workspace_dir
def __init__(self):
pass
def load_skills_from_dir(self, dir_path: str, source: str) -> LoadSkillsResult:
"""
Load skills from a directory.
Discovery rules:
- Direct .md files in the root directory
- Recursive SKILL.md files under subdirectories
:param dir_path: Directory path to scan
:param source: Source identifier (e.g., 'managed', 'workspace', 'bundled')
:param source: Source identifier ('builtin' or 'custom')
:return: LoadSkillsResult with skills and diagnostics
"""
skills = []
@@ -216,61 +211,49 @@ class SkillLoader:
def load_all_skills(
self,
managed_dir: Optional[str] = None,
workspace_skills_dir: Optional[str] = None,
extra_dirs: Optional[List[str]] = None,
builtin_dir: Optional[str] = None,
custom_dir: Optional[str] = None,
) -> Dict[str, SkillEntry]:
"""
Load skills from all configured locations with precedence.
Load skills from builtin and custom directories.
Precedence (lowest to highest):
1. Extra directories
2. Managed skills directory
3. Workspace skills directory
:param managed_dir: Managed skills directory (e.g., ~/.cow/skills)
:param workspace_skills_dir: Workspace skills directory (e.g., workspace/skills)
:param extra_dirs: Additional directories to load skills from
1. builtin — project root ``skills/``, shipped with the codebase
2. custom — workspace ``skills/``, installed via cloud console or skill creator
Same-name custom skills override builtin ones.
:param builtin_dir: Built-in skills directory
:param custom_dir: Custom skills directory
:return: Dictionary mapping skill name to SkillEntry
"""
skill_map: Dict[str, SkillEntry] = {}
all_diagnostics = []
# Load from extra directories (lowest precedence)
if extra_dirs:
for extra_dir in extra_dirs:
if not os.path.exists(extra_dir):
continue
result = self.load_skills_from_dir(extra_dir, source='extra')
all_diagnostics.extend(result.diagnostics)
for skill in result.skills:
entry = self._create_skill_entry(skill)
skill_map[skill.name] = entry
# Load from managed directory
if managed_dir and os.path.exists(managed_dir):
result = self.load_skills_from_dir(managed_dir, source='managed')
# Load builtin skills (lower precedence)
if builtin_dir and os.path.exists(builtin_dir):
result = self.load_skills_from_dir(builtin_dir, source='builtin')
all_diagnostics.extend(result.diagnostics)
for skill in result.skills:
entry = self._create_skill_entry(skill)
skill_map[skill.name] = entry
# Load from workspace directory (highest precedence)
if workspace_skills_dir and os.path.exists(workspace_skills_dir):
result = self.load_skills_from_dir(workspace_skills_dir, source='workspace')
# Load custom skills (higher precedence, overrides builtin)
if custom_dir and os.path.exists(custom_dir):
result = self.load_skills_from_dir(custom_dir, source='custom')
all_diagnostics.extend(result.diagnostics)
for skill in result.skills:
entry = self._create_skill_entry(skill)
skill_map[skill.name] = entry
# Log diagnostics
if all_diagnostics:
logger.debug(f"Skill loading diagnostics: {len(all_diagnostics)} issues")
for diag in all_diagnostics[:5]: # Log first 5
for diag in all_diagnostics[:5]:
logger.debug(f" - {diag}")
logger.debug(f"Loaded {len(skill_map)} skills from all sources")
logger.debug(f"Loaded {len(skill_map)} skills total")
return skill_map
def _create_skill_entry(self, skill: Skill) -> SkillEntry:

View File

@@ -3,6 +3,7 @@ Skill manager for managing skill lifecycle and operations.
"""
import os
import json
from typing import Dict, List, Optional
from pathlib import Path
from common.log import logger
@@ -10,56 +11,131 @@ from agent.skills.types import Skill, SkillEntry, SkillSnapshot
from agent.skills.loader import SkillLoader
from agent.skills.formatter import format_skill_entries_for_prompt
SKILLS_CONFIG_FILE = "skills_config.json"
class SkillManager:
"""Manages skills for an agent."""
def __init__(
self,
workspace_dir: Optional[str] = None,
managed_skills_dir: Optional[str] = None,
extra_dirs: Optional[List[str]] = None,
builtin_dir: Optional[str] = None,
custom_dir: Optional[str] = None,
config: Optional[Dict] = None,
):
"""
Initialize the skill manager.
:param workspace_dir: Agent workspace directory
:param managed_skills_dir: Managed skills directory (e.g., ~/.cow/skills)
:param extra_dirs: Additional skill directories
:param builtin_dir: Built-in skills directory (project root ``skills/``)
:param custom_dir: Custom skills directory (workspace ``skills/``)
:param config: Configuration dictionary
"""
self.workspace_dir = workspace_dir
self.managed_skills_dir = managed_skills_dir or self._get_default_managed_dir()
self.extra_dirs = extra_dirs or []
project_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
self.builtin_dir = builtin_dir or os.path.join(project_root, 'skills')
self.custom_dir = custom_dir or os.path.join(project_root, 'workspace', 'skills')
self.config = config or {}
self.loader = SkillLoader(workspace_dir=workspace_dir)
self._skills_config_path = os.path.join(self.custom_dir, SKILLS_CONFIG_FILE)
# skills_config: full skill metadata keyed by name
# { "web-fetch": {"name": ..., "description": ..., "source": ..., "enabled": true}, ... }
self.skills_config: Dict[str, dict] = {}
self.loader = SkillLoader()
self.skills: Dict[str, SkillEntry] = {}
# Load skills on initialization
self.refresh_skills()
def _get_default_managed_dir(self) -> str:
"""Get the default managed skills directory."""
# Use project root skills directory as default
import os
project_root = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
return os.path.join(project_root, 'skills')
def refresh_skills(self):
"""Reload all skills from configured directories."""
workspace_skills_dir = None
if self.workspace_dir:
workspace_skills_dir = os.path.join(self.workspace_dir, 'skills')
"""Reload all skills from builtin and custom directories, then sync config."""
self.skills = self.loader.load_all_skills(
managed_dir=self.managed_skills_dir,
workspace_skills_dir=workspace_skills_dir,
extra_dirs=self.extra_dirs,
builtin_dir=self.builtin_dir,
custom_dir=self.custom_dir,
)
self._sync_skills_config()
logger.debug(f"SkillManager: Loaded {len(self.skills)} skills")
# ------------------------------------------------------------------
# skills_config.json management
# ------------------------------------------------------------------
def _load_skills_config(self) -> Dict[str, dict]:
"""Load skills_config.json from custom_dir. Returns empty dict if not found."""
if not os.path.exists(self._skills_config_path):
return {}
try:
with open(self._skills_config_path, "r", encoding="utf-8") as f:
data = json.load(f)
if isinstance(data, dict):
return data
except Exception as e:
logger.warning(f"[SkillManager] Failed to load {SKILLS_CONFIG_FILE}: {e}")
return {}
def _save_skills_config(self):
"""Persist skills_config to custom_dir/skills_config.json."""
os.makedirs(self.custom_dir, exist_ok=True)
try:
with open(self._skills_config_path, "w", encoding="utf-8") as f:
json.dump(self.skills_config, f, indent=4, ensure_ascii=False)
except Exception as e:
logger.error(f"[SkillManager] Failed to save {SKILLS_CONFIG_FILE}: {e}")
def _sync_skills_config(self):
"""
Merge directory-scanned skills with the persisted config file.
- New skills discovered on disk are added with enabled=True.
- Skills that no longer exist on disk are removed.
- Existing entries preserve their enabled state; name/description/source
are refreshed from the latest scan.
"""
saved = self._load_skills_config()
merged: Dict[str, dict] = {}
for name, entry in self.skills.items():
skill = entry.skill
prev = saved.get(name, {})
merged[name] = {
"name": name,
"description": skill.description,
"source": skill.source,
"enabled": prev.get("enabled", True),
}
self.skills_config = merged
self._save_skills_config()
def is_skill_enabled(self, name: str) -> bool:
"""
Check if a skill is enabled according to skills_config.
:param name: skill name
:return: True if enabled (default True if not in config)
"""
entry = self.skills_config.get(name)
if entry is None:
return True
return entry.get("enabled", True)
def set_skill_enabled(self, name: str, enabled: bool):
"""
Set a skill's enabled state and persist.
:param name: skill name
:param enabled: True to enable, False to disable
"""
if name not in self.skills_config:
raise ValueError(f"skill '{name}' not found in config")
self.skills_config[name]["enabled"] = enabled
self._save_skills_config()
def get_skills_config(self) -> Dict[str, dict]:
"""
Return the full skills_config dict (for query API).
:return: copy of skills_config
"""
return dict(self.skills_config)
def get_skill(self, name: str) -> Optional[SkillEntry]:
"""
@@ -85,25 +161,24 @@ class SkillManager:
) -> List[SkillEntry]:
"""
Filter skills based on criteria.
Simple rule: Skills are auto-enabled if requirements are met.
- Has required API keys included
- Missing API keys excluded
- Has required API keys -> included
- Missing API keys -> excluded
:param skill_filter: List of skill names to include (None = all)
:param include_disabled: Whether to include skills with disable_model_invocation=True
:param include_disabled: Whether to include disabled skills
:return: Filtered list of skill entries
"""
from agent.skills.config import should_include_skill
entries = list(self.skills.values())
# Check requirements (platform, binaries, env vars)
entries = [e for e in entries if should_include_skill(e, self.config)]
# Apply skill filter
if skill_filter is not None:
# Flatten and normalize skill names (handle both strings and nested lists)
normalized = []
for item in skill_filter:
if isinstance(item, str):
@@ -111,20 +186,18 @@ class SkillManager:
if name:
normalized.append(name)
elif isinstance(item, list):
# Handle nested lists
for subitem in item:
if isinstance(subitem, str):
name = subitem.strip()
if name:
normalized.append(name)
if normalized:
entries = [e for e in entries if e.skill.name in normalized]
# Filter out disabled skills unless explicitly requested
# Filter out disabled skills based on skills_config.json
if not include_disabled:
entries = [e for e in entries if not e.skill.disable_model_invocation]
entries = [e for e in entries if self.is_skill_enabled(e.skill.name)]
return entries
def build_skills_prompt(

204
agent/skills/service.py Normal file
View File

@@ -0,0 +1,204 @@
"""
Skill service for handling skill CRUD operations.
This service provides a unified interface for managing skills, which can be
called from the cloud control client (LinkAI), the local web console, or any
other management entry point.
"""
import os
import shutil
from typing import Dict, List, Optional
from common.log import logger
from agent.skills.types import Skill, SkillEntry
from agent.skills.manager import SkillManager
try:
import requests
except ImportError:
requests = None
class SkillService:
"""
High-level service for skill lifecycle management.
Wraps SkillManager and provides network-aware operations such as
downloading skill files from remote URLs.
"""
def __init__(self, skill_manager: SkillManager):
"""
:param skill_manager: The SkillManager instance to operate on
"""
self.manager = skill_manager
# ------------------------------------------------------------------
# query
# ------------------------------------------------------------------
def query(self) -> List[dict]:
"""
Query all skills and return a serialisable list.
Reads from skills_config.json (refreshes from disk if needed).
:return: list of skill info dicts
"""
self.manager.refresh_skills()
config = self.manager.get_skills_config()
result = list(config.values())
logger.info(f"[SkillService] query: {len(result)} skills found")
return result
# ------------------------------------------------------------------
# add / install
# ------------------------------------------------------------------
def add(self, payload: dict) -> None:
"""
Add (install) a skill from a remote payload.
The payload follows the socket protocol::
{
"name": "web_search",
"type": "url",
"enabled": true,
"files": [
{"url": "https://...", "path": "README.md"},
{"url": "https://...", "path": "scripts/main.py"}
]
}
Files are downloaded and saved under the custom skills directory
using *name* as the sub-directory.
:param payload: skill add payload from server
"""
name = payload.get("name")
if not name:
raise ValueError("skill name is required")
files = payload.get("files", [])
if not files:
raise ValueError("skill files list is empty")
skill_dir = os.path.join(self.manager.custom_dir, name)
os.makedirs(skill_dir, exist_ok=True)
for file_info in files:
url = file_info.get("url")
rel_path = file_info.get("path")
if not url or not rel_path:
logger.warning(f"[SkillService] add: skip invalid file entry {file_info}")
continue
dest = os.path.join(skill_dir, rel_path)
self._download_file(url, dest)
# Reload to pick up the new skill and sync config
self.manager.refresh_skills()
logger.info(f"[SkillService] add: skill '{name}' installed ({len(files)} files)")
# ------------------------------------------------------------------
# open / close (enable / disable)
# ------------------------------------------------------------------
def open(self, payload: dict) -> None:
"""
Enable a skill by name.
:param payload: {"name": "skill_name"}
"""
name = payload.get("name")
if not name:
raise ValueError("skill name is required")
self.manager.set_skill_enabled(name, enabled=True)
logger.info(f"[SkillService] open: skill '{name}' enabled")
def close(self, payload: dict) -> None:
"""
Disable a skill by name.
:param payload: {"name": "skill_name"}
"""
name = payload.get("name")
if not name:
raise ValueError("skill name is required")
self.manager.set_skill_enabled(name, enabled=False)
logger.info(f"[SkillService] close: skill '{name}' disabled")
# ------------------------------------------------------------------
# delete
# ------------------------------------------------------------------
def delete(self, payload: dict) -> None:
"""
Delete a skill by removing its directory entirely.
:param payload: {"name": "skill_name"}
"""
name = payload.get("name")
if not name:
raise ValueError("skill name is required")
skill_dir = os.path.join(self.manager.custom_dir, name)
if os.path.exists(skill_dir):
shutil.rmtree(skill_dir)
logger.info(f"[SkillService] delete: removed directory {skill_dir}")
else:
logger.warning(f"[SkillService] delete: skill directory not found: {skill_dir}")
# Refresh will remove the deleted skill from config automatically
self.manager.refresh_skills()
logger.info(f"[SkillService] delete: skill '{name}' deleted")
# ------------------------------------------------------------------
# dispatch - single entry point for protocol messages
# ------------------------------------------------------------------
def dispatch(self, action: str, payload: Optional[dict] = None) -> dict:
"""
Dispatch a skill management action and return a protocol-compatible
response dict.
:param action: one of query / add / open / close / delete
:param payload: action-specific payload (may be None for query)
:return: dict with action, code, message, payload
"""
payload = payload or {}
try:
if action == "query":
result_payload = self.query()
return {"action": action, "code": 200, "message": "success", "payload": result_payload}
elif action == "add":
self.add(payload)
elif action == "open":
self.open(payload)
elif action == "close":
self.close(payload)
elif action == "delete":
self.delete(payload)
else:
return {"action": action, "code": 400, "message": f"unknown action: {action}", "payload": None}
return {"action": action, "code": 200, "message": "success", "payload": None}
except Exception as e:
logger.error(f"[SkillService] dispatch error: action={action}, error={e}")
return {"action": action, "code": 500, "message": str(e), "payload": None}
# ------------------------------------------------------------------
# internal helpers
# ------------------------------------------------------------------
@staticmethod
def _download_file(url: str, dest: str):
"""
Download a file from *url* and save to *dest*.
:param url: remote file URL
:param dest: local destination path
"""
if requests is None:
raise RuntimeError("requests library is required for downloading skill files")
dest_dir = os.path.dirname(dest)
if dest_dir:
os.makedirs(dest_dir, exist_ok=True)
resp = requests.get(url, timeout=60)
resp.raise_for_status()
with open(dest, "wb") as f:
f.write(resp.content)
logger.debug(f"[SkillService] downloaded {url} -> {dest}")

View File

@@ -45,7 +45,7 @@ class Skill:
description: str
file_path: str
base_dir: str
source: str # managed, workspace, bundled, etc.
source: str # builtin or custom
content: str # Full markdown content
disable_model_invocation: bool = False
frontmatter: Dict[str, Any] = field(default_factory=dict)

162
app.py
View File

@@ -7,11 +7,152 @@ import time
from channel import channel_factory
from common import const
from config import load_config
from common.log import logger
from config import load_config, conf
from plugins import *
import threading
# Global channel manager for restart support
_channel_mgr = None
def get_channel_manager():
return _channel_mgr
class ChannelManager:
"""
Manage the lifecycle of a channel, supporting restart from sub-threads.
The channel.startup() runs in a daemon thread so that the main thread
remains available and a new channel can be started at any time.
"""
def __init__(self):
self._channel = None
self._channel_thread = None
self._lock = threading.Lock()
@property
def channel(self):
return self._channel
def start(self, channel_name: str, first_start: bool = False):
"""
Create and start a channel in a sub-thread.
If first_start is True, plugins and linkai client will also be initialized.
"""
with self._lock:
channel = channel_factory.create_channel(channel_name)
self._channel = channel
if first_start:
if channel_name in ["wx", "wxy", "terminal", "wechatmp", "web",
"wechatmp_service", "wechatcom_app", "wework",
const.FEISHU, const.DINGTALK]:
PluginManager().load_plugins()
if conf().get("use_linkai"):
try:
from common import cloud_client
threading.Thread(target=cloud_client.start, args=(channel, self), daemon=True).start()
except Exception as e:
pass
# Run channel.startup() in a daemon thread so we can restart later
self._channel_thread = threading.Thread(
target=self._run_channel, args=(channel,), daemon=True
)
self._channel_thread.start()
logger.debug(f"[ChannelManager] Channel '{channel_name}' started in sub-thread")
def _run_channel(self, channel):
try:
channel.startup()
except Exception as e:
logger.error(f"[ChannelManager] Channel startup error: {e}")
logger.exception(e)
def stop(self):
"""
Stop the current channel. Since most channel startup() methods block
on an HTTP server or stream client, we stop by terminating the thread.
"""
with self._lock:
if self._channel is None:
return
channel_type = getattr(self._channel, 'channel_type', 'unknown')
logger.info(f"[ChannelManager] Stopping channel '{channel_type}'...")
# Try graceful stop if channel implements it
try:
if hasattr(self._channel, 'stop'):
self._channel.stop()
except Exception as e:
logger.warning(f"[ChannelManager] Error during channel stop: {e}")
self._channel = None
self._channel_thread = None
def restart(self, new_channel_name: str):
"""
Restart the channel with a new channel type.
Can be called from any thread (e.g. linkai config callback).
"""
logger.info(f"[ChannelManager] Restarting channel to '{new_channel_name}'...")
self.stop()
# Clear singleton cache so a fresh channel instance is created
_clear_singleton_cache(new_channel_name)
time.sleep(1) # Brief pause to allow resources to release
self.start(new_channel_name, first_start=False)
logger.info(f"[ChannelManager] Channel restarted to '{new_channel_name}' successfully")
def _clear_singleton_cache(channel_name: str):
"""
Clear the singleton cache for the channel class so that
a new instance can be created with updated config.
"""
cls_map = {
"wx": "channel.wechat.wechat_channel.WechatChannel",
"wxy": "channel.wechat.wechaty_channel.WechatyChannel",
"wcf": "channel.wechat.wcf_channel.WechatfChannel",
"web": "channel.web.web_channel.WebChannel",
"wechatmp": "channel.wechatmp.wechatmp_channel.WechatMPChannel",
"wechatmp_service": "channel.wechatmp.wechatmp_channel.WechatMPChannel",
"wechatcom_app": "channel.wechatcom.wechatcomapp_channel.WechatComAppChannel",
"wework": "channel.wework.wework_channel.WeworkChannel",
const.FEISHU: "channel.feishu.feishu_channel.FeiShuChanel",
const.DINGTALK: "channel.dingtalk.dingtalk_channel.DingTalkChanel",
}
module_path = cls_map.get(channel_name)
if not module_path:
return
# The singleton decorator stores instances in a closure dict keyed by class.
# We need to find the actual class and clear it from the closure.
try:
parts = module_path.rsplit(".", 1)
module_name, class_name = parts[0], parts[1]
import importlib
module = importlib.import_module(module_name)
# The module-level name is the wrapper function from @singleton
wrapper = getattr(module, class_name, None)
if wrapper and hasattr(wrapper, '__closure__') and wrapper.__closure__:
for cell in wrapper.__closure__:
try:
cell_contents = cell.cell_contents
if isinstance(cell_contents, dict):
cell_contents.clear()
logger.debug(f"[ChannelManager] Cleared singleton cache for {class_name}")
break
except ValueError:
pass
except Exception as e:
logger.warning(f"[ChannelManager] Failed to clear singleton cache: {e}")
def sigterm_handler_wrap(_signo):
old_handler = signal.getsignal(_signo)
@@ -25,22 +166,8 @@ def sigterm_handler_wrap(_signo):
signal.signal(_signo, func)
def start_channel(channel_name: str):
channel = channel_factory.create_channel(channel_name)
if channel_name in ["wx", "wxy", "terminal", "wechatmp", "web", "wechatmp_service", "wechatcom_app", "wework",
const.FEISHU, const.DINGTALK]:
PluginManager().load_plugins()
if conf().get("use_linkai"):
try:
from common import linkai_client
threading.Thread(target=linkai_client.start, args=(channel,)).start()
except Exception as e:
pass
channel.startup()
def run():
global _channel_mgr
try:
# load config
load_config()
@@ -58,7 +185,8 @@ def run():
if channel_name == "wxy":
os.environ["WECHATY_LOG"] = "warn"
start_channel(channel_name)
_channel_mgr = ChannelManager()
_channel_mgr.start(channel_name, first_start=True)
while True:
time.sleep(1)

View File

@@ -28,7 +28,7 @@ def add_openai_compatible_support(bot_instance):
"""
if hasattr(bot_instance, 'call_with_tools'):
# Bot already has tool calling support (e.g., ZHIPUAIBot)
logger.info(f"[AgentBridge] {type(bot_instance).__name__} already has native tool calling support")
logger.debug(f"[AgentBridge] {type(bot_instance).__name__} already has native tool calling support")
return bot_instance
# Create a temporary mixin class that combines the bot with OpenAI compatibility

View File

@@ -74,7 +74,7 @@ class AgentEventHandler:
# Only send thinking process if followed by tool calls
if tool_calls:
if self.current_thinking.strip():
logger.debug(f"💭 {self.current_thinking.strip()[:200]}{'...' if len(self.current_thinking) > 200 else ''}")
logger.info(f"💭 {self.current_thinking.strip()[:200]}{'...' if len(self.current_thinking) > 200 else ''}")
# Send thinking process to channel
self._send_to_channel(f"{self.current_thinking.strip()}")
else:

View File

@@ -291,7 +291,7 @@ class AgentInitializer:
"""Initialize skill manager"""
try:
from agent.skills import SkillManager
skill_manager = SkillManager(workspace_dir=workspace_root)
skill_manager = SkillManager(custom_dir=os.path.join(workspace_root, "skills"))
return skill_manager
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to initialize SkillManager: {e}")

View File

@@ -55,6 +55,11 @@ class Bridge(object):
if model_type in [const.MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k"]:
self.btype["chat"] = const.MOONSHOT
if model_type and model_type.startswith("kimi"):
self.btype["chat"] = const.MOONSHOT
if model_type and model_type.startswith("doubao"):
self.btype["chat"] = const.DOUBAO
if model_type in [const.MODELSCOPE]:
self.btype["chat"] = const.MODELSCOPE

View File

@@ -19,6 +19,12 @@ class Channel(object):
"""
raise NotImplementedError
def stop(self):
"""
stop channel gracefully, called before restart
"""
pass
def handle_text(self, msg):
"""
process received msg

View File

@@ -90,13 +90,9 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
dingtalk_client_secret = conf().get('dingtalk_client_secret')
def setup_logger(self):
logger = logging.getLogger()
handler = logging.StreamHandler()
handler.setFormatter(
logging.Formatter('%(asctime)s %(name)-8s %(levelname)-8s %(message)s [%(filename)s:%(lineno)d]'))
logger.addHandler(handler)
logger.setLevel(logging.INFO)
return logger
# Suppress verbose logs from dingtalk_stream SDK
logging.getLogger("dingtalk_stream").setLevel(logging.WARNING)
return logging.getLogger("DingTalk")
def __init__(self):
super().__init__()
@@ -104,6 +100,7 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
self.logger = self.setup_logger()
# 历史消息id暂存用于幂等控制
self.receivedMsgs = ExpiredDict(conf().get("expires_in_seconds", 3600))
self._stream_client = None
logger.debug("[DingTalk] client_id={}, client_secret={} ".format(
self.dingtalk_client_id, self.dingtalk_client_secret))
# 无需群校验和前缀
@@ -119,9 +116,19 @@ class DingTalkChanel(ChatChannel, dingtalk_stream.ChatbotHandler):
def startup(self):
credential = dingtalk_stream.Credential(self.dingtalk_client_id, self.dingtalk_client_secret)
client = dingtalk_stream.DingTalkStreamClient(credential)
self._stream_client = client
client.register_callback_handler(dingtalk_stream.chatbot.ChatbotMessage.TOPIC, self)
logger.info("[DingTalk] ✅ Stream connected, ready to receive messages")
client.start_forever()
def stop(self):
if self._stream_client:
try:
self._stream_client.stop()
logger.info("[DingTalk] Stream client stopped")
except Exception as e:
logger.warning(f"[DingTalk] Error stopping stream client: {e}")
self._stream_client = None
def get_access_token(self):
"""

View File

@@ -12,6 +12,7 @@
"""
import json
import logging
import os
import ssl
import threading
@@ -32,6 +33,9 @@ from common.log import logger
from common.singleton import singleton
from config import conf
# Suppress verbose logs from Lark SDK
logging.getLogger("Lark").setLevel(logging.WARNING)
URL_VERIFICATION = "url_verification"
# 尝试导入飞书SDK,如果未安装则websocket模式不可用
@@ -56,6 +60,7 @@ class FeiShuChanel(ChatChannel):
super().__init__()
# 历史消息id暂存用于幂等控制
self.receivedMsgs = ExpiredDict(60 * 60 * 7.1)
self._http_server = None
logger.debug("[FeiShu] app_id={}, app_secret={}, verification_token={}, event_mode={}".format(
self.feishu_app_id, self.feishu_app_secret, self.feishu_token, self.feishu_event_mode))
# 无需群校验和前缀
@@ -73,6 +78,15 @@ class FeiShuChanel(ChatChannel):
else:
self._startup_webhook()
def stop(self):
if self._http_server:
try:
self._http_server.stop()
logger.info("[FeiShu] HTTP server stopped")
except Exception as e:
logger.warning(f"[FeiShu] Error stopping HTTP server: {e}")
self._http_server = None
def _startup_webhook(self):
"""启动HTTP服务器接收事件(webhook模式)"""
logger.debug("[FeiShu] Starting in webhook mode...")
@@ -81,7 +95,14 @@ class FeiShuChanel(ChatChannel):
)
app = web.application(urls, globals(), autoreload=False)
port = conf().get("feishu_port", 9891)
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
func = web.httpserver.StaticMiddleware(app.wsgifunc())
func = web.httpserver.LogMiddleware(func)
server = web.httpserver.WSGIServer(("0.0.0.0", port), func)
self._http_server = server
try:
server.start()
except (KeyboardInterrupt, SystemExit):
server.stop()
def _startup_websocket(self):
"""启动长连接接收事件(websocket模式)"""
@@ -138,7 +159,7 @@ class FeiShuChanel(ChatChannel):
self.feishu_app_id,
self.feishu_app_secret,
event_handler=event_handler,
log_level=lark.LogLevel.DEBUG if conf().get("debug") else lark.LogLevel.INFO
log_level=lark.LogLevel.DEBUG if conf().get("debug") else lark.LogLevel.WARNING
)
logger.debug("[FeiShu] Websocket client starting...")

View File

@@ -50,6 +50,7 @@ class WebChannel(ChatChannel):
self.msg_id_counter = 0 # 添加消息ID计数器
self.session_queues = {} # 存储session_id到队列的映射
self.request_to_session = {} # 存储request_id到session_id的映射
self._http_server = None
def _generate_msg_id(self):
@@ -235,13 +236,24 @@ class WebChannel(ChatChannel):
logging.getLogger("web").setLevel(logging.ERROR)
logging.getLogger("web.httpserver").setLevel(logging.ERROR)
# 抑制 web.py 默认的服务器启动消息
old_stdout = sys.stdout
sys.stdout = io.StringIO()
# Build WSGI app with middleware (same as runsimple but without print)
func = web.httpserver.StaticMiddleware(app.wsgifunc())
func = web.httpserver.LogMiddleware(func)
server = web.httpserver.WSGIServer(("0.0.0.0", port), func)
self._http_server = server
try:
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
finally:
sys.stdout = old_stdout
server.start()
except (KeyboardInterrupt, SystemExit):
server.stop()
def stop(self):
if self._http_server:
try:
self._http_server.stop()
logger.info("[WebChannel] HTTP server stopped")
except Exception as e:
logger.warning(f"[WebChannel] Error stopping HTTP server: {e}")
self._http_server = None
class RootHandler:

View File

@@ -151,7 +151,7 @@ class WechatChannel(ChatChannel):
def exitCallback(self):
try:
from common.linkai_client import chat_client
from common.cloud_client import chat_client
if chat_client.client_id and conf().get("use_linkai"):
_send_logout()
time.sleep(2)
@@ -283,7 +283,7 @@ class WechatChannel(ChatChannel):
def _send_login_success():
try:
from common.linkai_client import chat_client
from common.cloud_client import chat_client
if chat_client.client_id:
chat_client.send_login_success()
except Exception as e:
@@ -292,7 +292,7 @@ def _send_login_success():
def _send_logout():
try:
from common.linkai_client import chat_client
from common.cloud_client import chat_client
if chat_client.client_id:
chat_client.send_logout()
except Exception as e:
@@ -301,7 +301,7 @@ def _send_logout():
def _send_qr_code(qrcode_list: list):
try:
from common.linkai_client import chat_client
from common.cloud_client import chat_client
if chat_client.client_id:
chat_client.send_qrcode(qrcode_list)
except Exception as e:

View File

@@ -36,6 +36,7 @@ class WechatComAppChannel(ChatChannel):
self.agent_id = conf().get("wechatcomapp_agent_id")
self.token = conf().get("wechatcomapp_token")
self.aes_key = conf().get("wechatcomapp_aes_key")
self._http_server = None
logger.info(
"[wechatcom] Initializing WeCom app channel, corp_id: {}, agent_id: {}".format(self.corp_id, self.agent_id)
)
@@ -51,13 +52,24 @@ class WechatComAppChannel(ChatChannel):
logger.info("[wechatcom] 📡 Listening on http://0.0.0.0:{}/wxcomapp/".format(port))
logger.info("[wechatcom] 🤖 Ready to receive messages")
# Suppress web.py's default server startup message
old_stdout = sys.stdout
sys.stdout = io.StringIO()
# Build WSGI app with middleware (same as runsimple but without print)
func = web.httpserver.StaticMiddleware(app.wsgifunc())
func = web.httpserver.LogMiddleware(func)
server = web.httpserver.WSGIServer(("0.0.0.0", port), func)
self._http_server = server
try:
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
finally:
sys.stdout = old_stdout
server.start()
except (KeyboardInterrupt, SystemExit):
server.stop()
def stop(self):
if self._http_server:
try:
self._http_server.stop()
logger.info("[wechatcom] HTTP server stopped")
except Exception as e:
logger.warning(f"[wechatcom] Error stopping HTTP server: {e}")
self._http_server = None
def send(self, reply: Reply, context: Context):
receiver = context["receiver"]

View File

@@ -41,6 +41,7 @@ class WechatMPChannel(ChatChannel):
super().__init__()
self.passive_reply = passive_reply
self.NOT_SUPPORT_REPLYTYPE = []
self._http_server = None
appid = conf().get("wechatmp_app_id")
secret = conf().get("wechatmp_app_secret")
token = conf().get("wechatmp_token")
@@ -69,7 +70,23 @@ class WechatMPChannel(ChatChannel):
urls = ("/wx", "channel.wechatmp.active_reply.Query")
app = web.application(urls, globals(), autoreload=False)
port = conf().get("wechatmp_port", 8080)
web.httpserver.runsimple(app.wsgifunc(), ("0.0.0.0", port))
func = web.httpserver.StaticMiddleware(app.wsgifunc())
func = web.httpserver.LogMiddleware(func)
server = web.httpserver.WSGIServer(("0.0.0.0", port), func)
self._http_server = server
try:
server.start()
except (KeyboardInterrupt, SystemExit):
server.stop()
def stop(self):
if self._http_server:
try:
self._http_server.stop()
logger.info("[wechatmp] HTTP server stopped")
except Exception as e:
logger.warning(f"[wechatmp] Error stopping HTTP server: {e}")
self._http_server = None
def start_loop(self, loop):
asyncio.set_event_loop(loop)

View File

@@ -20,7 +20,6 @@ from common.utils import compress_imgfile, fsize
from config import conf
from channel.wework.run import wework
from channel.wework import run
from PIL import Image
def get_wxid_by_name(room_members, group_wxid, name):
@@ -55,6 +54,7 @@ def download_and_compress_image(url, filename, quality=30):
image_storage.seek(0)
# 读取并保存图片
from PIL import Image
image = Image.open(image_storage)
image_path = os.path.join(directory, f"{filename}.png")
image.save(image_path, "png")

375
common/cloud_client.py Normal file
View File

@@ -0,0 +1,375 @@
"""
Cloud management client for connecting to the LinkAI control console.
Handles remote configuration sync, message push, and skill management
via the LinkAI socket protocol.
"""
from bridge.context import Context, ContextType
from bridge.reply import Reply, ReplyType
from common.log import logger
from linkai import LinkAIClient, PushMsg
from config import conf, pconf, plugin_config, available_setting, write_plugin_config, get_root
from plugins import PluginManager
import threading
import time
import json
import os
chat_client: LinkAIClient
class CloudClient(LinkAIClient):
def __init__(self, api_key: str, channel, host: str = ""):
super().__init__(api_key, host)
self.channel = channel
self.client_type = channel.channel_type
self.channel_mgr = None
self._skill_service = None
self._memory_service = None
self._chat_service = None
@property
def skill_service(self):
"""Lazy-init SkillService so it is available once SkillManager exists."""
if self._skill_service is None:
try:
from agent.skills.manager import SkillManager
from agent.skills.service import SkillService
from config import conf
from common.utils import expand_path
workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
manager = SkillManager(custom_dir=os.path.join(workspace_root, "skills"))
self._skill_service = SkillService(manager)
logger.debug("[CloudClient] SkillService initialised")
except Exception as e:
logger.error(f"[CloudClient] Failed to init SkillService: {e}")
return self._skill_service
@property
def memory_service(self):
"""Lazy-init MemoryService."""
if self._memory_service is None:
try:
from agent.memory.service import MemoryService
from config import conf
from common.utils import expand_path
workspace_root = expand_path(conf().get("agent_workspace", "~/cow"))
self._memory_service = MemoryService(workspace_root)
logger.debug("[CloudClient] MemoryService initialised")
except Exception as e:
logger.error(f"[CloudClient] Failed to init MemoryService: {e}")
return self._memory_service
@property
def chat_service(self):
"""Lazy-init ChatService (requires AgentBridge via Bridge singleton)."""
if self._chat_service is None:
try:
from agent.chat.service import ChatService
from bridge.bridge import Bridge
agent_bridge = Bridge().get_agent_bridge()
self._chat_service = ChatService(agent_bridge)
logger.debug("[CloudClient] ChatService initialised")
except Exception as e:
logger.error(f"[CloudClient] Failed to init ChatService: {e}")
return self._chat_service
# ------------------------------------------------------------------
# message push callback
# ------------------------------------------------------------------
def on_message(self, push_msg: PushMsg):
session_id = push_msg.session_id
msg_content = push_msg.msg_content
logger.info(f"receive msg push, session_id={session_id}, msg_content={msg_content}")
context = Context()
context.type = ContextType.TEXT
context["receiver"] = session_id
context["isgroup"] = push_msg.is_group
self.channel.send(Reply(ReplyType.TEXT, content=msg_content), context)
# ------------------------------------------------------------------
# config callback
# ------------------------------------------------------------------
def on_config(self, config: dict):
if not self.client_id:
return
logger.info(f"[CloudClient] Loading remote config: {config}")
if config.get("enabled") != "Y":
return
local_config = conf()
need_restart_channel = False
for key in config.keys():
if key in available_setting and config.get(key) is not None:
local_config[key] = config.get(key)
# Voice settings
reply_voice_mode = config.get("reply_voice_mode")
if reply_voice_mode:
if reply_voice_mode == "voice_reply_voice":
local_config["voice_reply_voice"] = True
local_config["always_reply_voice"] = False
elif reply_voice_mode == "always_reply_voice":
local_config["always_reply_voice"] = True
local_config["voice_reply_voice"] = True
elif reply_voice_mode == "no_reply_voice":
local_config["always_reply_voice"] = False
local_config["voice_reply_voice"] = False
# Model configuration
if config.get("model"):
local_config["model"] = config.get("model")
# Channel configuration
if config.get("channelType"):
if local_config.get("channel_type") != config.get("channelType"):
local_config["channel_type"] = config.get("channelType")
need_restart_channel = True
# Channel-specific app credentials
current_channel_type = local_config.get("channel_type", "")
if config.get("app_id") is not None:
if current_channel_type == "feishu":
if local_config.get("feishu_app_id") != config.get("app_id"):
local_config["feishu_app_id"] = config.get("app_id")
need_restart_channel = True
elif current_channel_type == "dingtalk":
if local_config.get("dingtalk_client_id") != config.get("app_id"):
local_config["dingtalk_client_id"] = config.get("app_id")
need_restart_channel = True
elif current_channel_type in ("wechatmp", "wechatmp_service"):
if local_config.get("wechatmp_app_id") != config.get("app_id"):
local_config["wechatmp_app_id"] = config.get("app_id")
need_restart_channel = True
elif current_channel_type == "wechatcom_app":
if local_config.get("wechatcomapp_agent_id") != config.get("app_id"):
local_config["wechatcomapp_agent_id"] = config.get("app_id")
need_restart_channel = True
if config.get("app_secret"):
if current_channel_type == "feishu":
if local_config.get("feishu_app_secret") != config.get("app_secret"):
local_config["feishu_app_secret"] = config.get("app_secret")
need_restart_channel = True
elif current_channel_type == "dingtalk":
if local_config.get("dingtalk_client_secret") != config.get("app_secret"):
local_config["dingtalk_client_secret"] = config.get("app_secret")
need_restart_channel = True
elif current_channel_type in ("wechatmp", "wechatmp_service"):
if local_config.get("wechatmp_app_secret") != config.get("app_secret"):
local_config["wechatmp_app_secret"] = config.get("app_secret")
need_restart_channel = True
elif current_channel_type == "wechatcom_app":
if local_config.get("wechatcomapp_secret") != config.get("app_secret"):
local_config["wechatcomapp_secret"] = config.get("app_secret")
need_restart_channel = True
if config.get("admin_password"):
if not pconf("Godcmd"):
write_plugin_config({"Godcmd": {"password": config.get("admin_password"), "admin_users": []}})
else:
pconf("Godcmd")["password"] = config.get("admin_password")
PluginManager().instances["GODCMD"].reload()
if config.get("group_app_map") and pconf("linkai"):
local_group_map = {}
for mapping in config.get("group_app_map"):
local_group_map[mapping.get("group_name")] = mapping.get("app_code")
pconf("linkai")["group_app_map"] = local_group_map
PluginManager().instances["LINKAI"].reload()
if config.get("text_to_image") and config.get("text_to_image") == "midjourney" and pconf("linkai"):
if pconf("linkai")["midjourney"]:
pconf("linkai")["midjourney"]["enabled"] = True
pconf("linkai")["midjourney"]["use_image_create_prefix"] = True
elif config.get("text_to_image") and config.get("text_to_image") in ["dall-e-2", "dall-e-3"]:
if pconf("linkai")["midjourney"]:
pconf("linkai")["midjourney"]["use_image_create_prefix"] = False
# Save configuration to config.json file
self._save_config_to_file(local_config)
if need_restart_channel:
self._restart_channel(local_config.get("channel_type", ""))
# ------------------------------------------------------------------
# skill callback
# ------------------------------------------------------------------
def on_skill(self, data: dict) -> dict:
"""
Handle SKILL messages from the cloud console.
Delegates to SkillService.dispatch for the actual operations.
:param data: message data with 'action', 'clientId', 'payload'
:return: response dict
"""
action = data.get("action", "")
payload = data.get("payload")
logger.info(f"[CloudClient] on_skill: action={action}")
svc = self.skill_service
if svc is None:
return {"action": action, "code": 500, "message": "SkillService not available", "payload": None}
return svc.dispatch(action, payload)
# ------------------------------------------------------------------
# memory callback
# ------------------------------------------------------------------
def on_memory(self, data: dict) -> dict:
"""
Handle MEMORY messages from the cloud console.
Delegates to MemoryService.dispatch for the actual operations.
:param data: message data with 'action', 'clientId', 'payload'
:return: response dict
"""
action = data.get("action", "")
payload = data.get("payload")
logger.info(f"[CloudClient] on_memory: action={action}")
svc = self.memory_service
if svc is None:
return {"action": action, "code": 500, "message": "MemoryService not available", "payload": None}
return svc.dispatch(action, payload)
# ------------------------------------------------------------------
# chat callback
# ------------------------------------------------------------------
def on_chat(self, data: dict, send_chunk_fn):
"""
Handle CHAT messages from the cloud console.
Runs the agent in streaming mode and sends chunks back via send_chunk_fn.
:param data: message data with 'action' and 'payload' (query, session_id)
:param send_chunk_fn: callable(chunk_data: dict) to send one streaming chunk
"""
payload = data.get("payload", {})
query = payload.get("query", "")
session_id = payload.get("session_id", "cloud_console")
logger.info(f"[CloudClient] on_chat: session={session_id}, query={query[:80]}")
svc = self.chat_service
if svc is None:
raise RuntimeError("ChatService not available")
svc.run(query=query, session_id=session_id, send_chunk_fn=send_chunk_fn)
# ------------------------------------------------------------------
# channel restart helpers
# ------------------------------------------------------------------
def _restart_channel(self, new_channel_type: str):
"""
Restart the channel via ChannelManager when channel type changes.
"""
if self.channel_mgr:
logger.info(f"[CloudClient] Restarting channel to '{new_channel_type}'...")
threading.Thread(target=self._do_restart_channel, args=(self.channel_mgr, new_channel_type), daemon=True).start()
else:
logger.warning("[CloudClient] ChannelManager not available, please restart the application manually")
def _do_restart_channel(self, mgr, new_channel_type: str):
"""
Perform the channel restart in a separate thread to avoid blocking the config callback.
"""
try:
mgr.restart(new_channel_type)
# Update the client's channel reference
if mgr.channel:
self.channel = mgr.channel
self.client_type = mgr.channel.channel_type
logger.info(f"[CloudClient] Channel reference updated to '{new_channel_type}'")
except Exception as e:
logger.error(f"[CloudClient] Channel restart failed: {e}")
# ------------------------------------------------------------------
# config persistence
# ------------------------------------------------------------------
def _save_config_to_file(self, local_config: dict):
"""
Save configuration to config.json file.
"""
try:
config_path = os.path.join(get_root(), "config.json")
if not os.path.exists(config_path):
logger.warning(f"[CloudClient] config.json not found at {config_path}, skip saving")
return
with open(config_path, "r", encoding="utf-8") as f:
file_config = json.load(f)
file_config.update(dict(local_config))
with open(config_path, "w", encoding="utf-8") as f:
json.dump(file_config, f, indent=4, ensure_ascii=False)
logger.info("[CloudClient] Configuration saved to config.json successfully")
except Exception as e:
logger.error(f"[CloudClient] Failed to save configuration to config.json: {e}")
def start(channel, channel_mgr=None):
global chat_client
chat_client = CloudClient(api_key=conf().get("linkai_api_key"), host=conf().get("cloud_host", ""), channel=channel)
chat_client.channel_mgr = channel_mgr
chat_client.config = _build_config()
chat_client.start()
time.sleep(1.5)
if chat_client.client_id:
logger.info("[CloudClient] Console: https://link-ai.tech/console/clients")
def _build_config():
local_conf = conf()
config = {
"linkai_app_code": local_conf.get("linkai_app_code"),
"single_chat_prefix": local_conf.get("single_chat_prefix"),
"single_chat_reply_prefix": local_conf.get("single_chat_reply_prefix"),
"single_chat_reply_suffix": local_conf.get("single_chat_reply_suffix"),
"group_chat_prefix": local_conf.get("group_chat_prefix"),
"group_chat_reply_prefix": local_conf.get("group_chat_reply_prefix"),
"group_chat_reply_suffix": local_conf.get("group_chat_reply_suffix"),
"group_name_white_list": local_conf.get("group_name_white_list"),
"nick_name_black_list": local_conf.get("nick_name_black_list"),
"speech_recognition": "Y" if local_conf.get("speech_recognition") else "N",
"text_to_image": local_conf.get("text_to_image"),
"image_create_prefix": local_conf.get("image_create_prefix"),
"model": local_conf.get("model"),
"agent_max_context_turns": local_conf.get("agent_max_context_turns"),
"agent_max_context_tokens": local_conf.get("agent_max_context_tokens"),
"agent_max_steps": local_conf.get("agent_max_steps"),
"channelType": local_conf.get("channel_type"),
}
if local_conf.get("always_reply_voice"):
config["reply_voice_mode"] = "always_reply_voice"
elif local_conf.get("voice_reply_voice"):
config["reply_voice_mode"] = "voice_reply_voice"
if pconf("linkai"):
config["group_app_map"] = pconf("linkai").get("group_app_map")
if plugin_config.get("Godcmd"):
config["admin_password"] = plugin_config.get("Godcmd").get("password")
# Add channel-specific app credentials
current_channel_type = local_conf.get("channel_type", "")
if current_channel_type == "feishu":
config["app_id"] = local_conf.get("feishu_app_id")
config["app_secret"] = local_conf.get("feishu_app_secret")
elif current_channel_type == "dingtalk":
config["app_id"] = local_conf.get("dingtalk_client_id")
config["app_secret"] = local_conf.get("dingtalk_client_secret")
elif current_channel_type in ("wechatmp", "wechatmp_service"):
config["app_id"] = local_conf.get("wechatmp_app_id")
config["app_secret"] = local_conf.get("wechatmp_app_secret")
elif current_channel_type == "wechatcom_app":
config["app_id"] = local_conf.get("wechatcomapp_agent_id")
config["app_secret"] = local_conf.get("wechatcomapp_secret")
return config

View File

@@ -26,8 +26,9 @@ CLAUDE_35_SONNET_1022 = "claude-3-5-sonnet-20241022" # 带具体日期的模型
CLAUDE_35_SONNET_0620 = "claude-3-5-sonnet-20240620"
CLAUDE_4_OPUS = "claude-opus-4-0"
CLAUDE_4_6_OPUS = "claude-opus-4-6" # Claude Opus 4.6 - Agent推荐模型
CLAUDE_4_SONNET = "claude-sonnet-4-0" # Claude Sonnet 4.0 - Agent推荐模型
CLAUDE_4_SONNET = "claude-sonnet-4-0" # Claude Sonnet 4.0
CLAUDE_4_5_SONNET = "claude-sonnet-4-5" # Claude Sonnet 4.5 - Agent推荐模型
CLAUDE_4_6_SONNET = "claude-sonnet-4-6" # Claude Sonnet 4.6 - Agent推荐模型
# Gemini (Google)
GEMINI_PRO = "gemini-1.0-pro"
@@ -35,10 +36,11 @@ GEMINI_15_flash = "gemini-1.5-flash"
GEMINI_15_PRO = "gemini-1.5-pro"
GEMINI_20_flash_exp = "gemini-2.0-flash-exp" # exp结尾为实验模型会逐步不再支持
GEMINI_20_FLASH = "gemini-2.0-flash" # 正式版模型
GEMINI_25_FLASH_PRE = "gemini-2.5-flash-preview-05-20" # preview为预览版模型主要是新能力体验
GEMINI_25_FLASH_PRE = "gemini-2.5-flash-preview-05-20"
GEMINI_25_PRO_PRE = "gemini-2.5-pro-preview-05-06"
GEMINI_3_FLASH_PRE = "gemini-3-flash-preview" # Gemini 3 Flash Preview - Agent推荐模型
GEMINI_3_PRO_PRE = "gemini-3-pro-preview" # Gemini 3 Pro Preview - Agent推荐模型
GEMINI_3_PRO_PRE = "gemini-3-pro-preview" # Gemini 3 Pro Preview
GEMINI_31_PRO_PRE = "gemini-3.1-pro-preview" # Gemini 3.1 Pro Preview - Agent推荐模型
# OpenAI
GPT35 = "gpt-3.5-turbo"
@@ -80,15 +82,18 @@ QWEN_PLUS = "qwen-plus"
QWEN_MAX = "qwen-max"
QWEN_LONG = "qwen-long"
QWEN3_MAX = "qwen3-max" # Qwen3 Max - Agent推荐模型
QWEN35_PLUS = "qwen3.5-plus" # Qwen3.5 Plus - Omni model (MultiModalConversation)
QWQ_PLUS = "qwq-plus"
# MiniMax
MINIMAX_M2_5 = "MiniMax-M2.5" # MiniMax M2.5 - Latest
MINIMAX_M2_1 = "MiniMax-M2.1" # MiniMax M2.1 - Agent推荐模型
MINIMAX_M2_1_LIGHTNING = "MiniMax-M2.1-lightning" # MiniMax M2.1 极速版
MINIMAX_M2 = "MiniMax-M2" # MiniMax M2
MINIMAX_ABAB6_5 = "abab6.5-chat" # MiniMax abab6.5
# GLM (智谱AI)
GLM_5 = "glm-5" # 智谱 GLM-5 - Latest
GLM_4 = "glm-4"
GLM_4_PLUS = "glm-4-plus"
GLM_4_flash = "glm-4-flash"
@@ -101,6 +106,15 @@ GLM_4_7 = "glm-4.7" # 智谱 GLM-4.7 - Agent推荐模型
# Kimi (Moonshot)
MOONSHOT = "moonshot"
KIMI_K2 = "kimi-k2"
KIMI_K2_5 = "kimi-k2.5"
# Doubao (Volcengine Ark)
DOUBAO = "doubao"
DOUBAO_SEED_2_CODE = "doubao-seed-2-0-code-preview-260215"
DOUBAO_SEED_2_PRO = "doubao-seed-2-0-pro-260215"
DOUBAO_SEED_2_LITE = "doubao-seed-2-0-lite-260215"
DOUBAO_SEED_2_MINI = "doubao-seed-2-0-mini-260215"
# 其他模型
WEN_XIN = "wenxin"
@@ -121,12 +135,12 @@ MODELSCOPE_MODEL_LIST = ["LLM-Research/c4ai-command-r-plus-08-2024","mistralai/M
MODEL_LIST = [
# Claude
CLAUDE3, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229,
CLAUDE3, CLAUDE_4_6_SONNET, CLAUDE_4_6_OPUS, CLAUDE_4_OPUS, CLAUDE_4_5_SONNET, CLAUDE_4_SONNET, CLAUDE_3_OPUS, CLAUDE_3_OPUS_0229,
CLAUDE_35_SONNET, CLAUDE_35_SONNET_1022, CLAUDE_35_SONNET_0620, CLAUDE_3_SONNET, CLAUDE_3_HAIKU,
"claude", "claude-3-haiku", "claude-3-sonnet", "claude-3-opus", "claude-3.5-sonnet",
# Gemini
GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
GEMINI_31_PRO_PRE, GEMINI_3_PRO_PRE, GEMINI_3_FLASH_PRE, GEMINI_25_PRO_PRE, GEMINI_25_FLASH_PRE,
GEMINI_20_FLASH, GEMINI_20_flash_exp, GEMINI_15_PRO, GEMINI_15_flash, GEMINI_PRO, GEMINI,
# OpenAI
@@ -142,18 +156,22 @@ MODEL_LIST = [
DEEPSEEK_CHAT, DEEPSEEK_REASONER,
# Qwen
QWEN, QWEN_TURBO, QWEN_PLUS, QWEN_MAX, QWEN_LONG, QWEN3_MAX,
QWEN, QWEN_TURBO, QWEN_PLUS, QWEN_MAX, QWEN_LONG, QWEN3_MAX, QWEN35_PLUS,
# MiniMax
MiniMax, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,
MiniMax, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,
# GLM
ZHIPU_AI, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS,
ZHIPU_AI, GLM_5, GLM_4, GLM_4_PLUS, GLM_4_flash, GLM_4_LONG, GLM_4_ALLTOOLS,
GLM_4_0520, GLM_4_AIR, GLM_4_AIRX, GLM_4_7,
# Kimi
MOONSHOT, "moonshot-v1-8k", "moonshot-v1-32k", "moonshot-v1-128k",
KIMI_K2, KIMI_K2_5,
# Doubao
DOUBAO, DOUBAO_SEED_2_CODE, DOUBAO_SEED_2_PRO, DOUBAO_SEED_2_LITE, DOUBAO_SEED_2_MINI,
# 其他模型
WEN_XIN, WEN_XIN_4, XUNFEI,
LINKAI_35, LINKAI_4_TURBO, LINKAI_4o,

View File

@@ -1,110 +0,0 @@
from bridge.context import Context, ContextType
from bridge.reply import Reply, ReplyType
from common.log import logger
from linkai import LinkAIClient, PushMsg
from config import conf, pconf, plugin_config, available_setting, write_plugin_config
from plugins import PluginManager
import time
chat_client: LinkAIClient
class ChatClient(LinkAIClient):
def __init__(self, api_key, host, channel):
super().__init__(api_key, host)
self.channel = channel
self.client_type = channel.channel_type
def on_message(self, push_msg: PushMsg):
session_id = push_msg.session_id
msg_content = push_msg.msg_content
logger.info(f"receive msg push, session_id={session_id}, msg_content={msg_content}")
context = Context()
context.type = ContextType.TEXT
context["receiver"] = session_id
context["isgroup"] = push_msg.is_group
self.channel.send(Reply(ReplyType.TEXT, content=msg_content), context)
def on_config(self, config: dict):
if not self.client_id:
return
logger.info(f"[LinkAI] 从客户端管理加载远程配置: {config}")
if config.get("enabled") != "Y":
return
local_config = conf()
for key in config.keys():
if key in available_setting and config.get(key) is not None:
local_config[key] = config.get(key)
# 语音配置
reply_voice_mode = config.get("reply_voice_mode")
if reply_voice_mode:
if reply_voice_mode == "voice_reply_voice":
local_config["voice_reply_voice"] = True
local_config["always_reply_voice"] = False
elif reply_voice_mode == "always_reply_voice":
local_config["always_reply_voice"] = True
local_config["voice_reply_voice"] = True
elif reply_voice_mode == "no_reply_voice":
local_config["always_reply_voice"] = False
local_config["voice_reply_voice"] = False
if config.get("admin_password"):
if not pconf("Godcmd"):
write_plugin_config({"Godcmd": {"password": config.get("admin_password"), "admin_users": []} })
else:
pconf("Godcmd")["password"] = config.get("admin_password")
PluginManager().instances["GODCMD"].reload()
if config.get("group_app_map") and pconf("linkai"):
local_group_map = {}
for mapping in config.get("group_app_map"):
local_group_map[mapping.get("group_name")] = mapping.get("app_code")
pconf("linkai")["group_app_map"] = local_group_map
PluginManager().instances["LINKAI"].reload()
if config.get("text_to_image") and config.get("text_to_image") == "midjourney" and pconf("linkai"):
if pconf("linkai")["midjourney"]:
pconf("linkai")["midjourney"]["enabled"] = True
pconf("linkai")["midjourney"]["use_image_create_prefix"] = True
elif config.get("text_to_image") and config.get("text_to_image") in ["dall-e-2", "dall-e-3"]:
if pconf("linkai")["midjourney"]:
pconf("linkai")["midjourney"]["use_image_create_prefix"] = False
def start(channel):
global chat_client
chat_client = ChatClient(api_key=conf().get("linkai_api_key"), host="", channel=channel)
chat_client.config = _build_config()
chat_client.start()
time.sleep(1.5)
if chat_client.client_id:
logger.info("[LinkAI] 可前往控制台进行线上登录和配置https://link-ai.tech/console/clients")
def _build_config():
local_conf = conf()
config = {
"linkai_app_code": local_conf.get("linkai_app_code"),
"single_chat_prefix": local_conf.get("single_chat_prefix"),
"single_chat_reply_prefix": local_conf.get("single_chat_reply_prefix"),
"single_chat_reply_suffix": local_conf.get("single_chat_reply_suffix"),
"group_chat_prefix": local_conf.get("group_chat_prefix"),
"group_chat_reply_prefix": local_conf.get("group_chat_reply_prefix"),
"group_chat_reply_suffix": local_conf.get("group_chat_reply_suffix"),
"group_name_white_list": local_conf.get("group_name_white_list"),
"nick_name_black_list": local_conf.get("nick_name_black_list"),
"speech_recognition": "Y" if local_conf.get("speech_recognition") else "N",
"text_to_image": local_conf.get("text_to_image"),
"image_create_prefix": local_conf.get("image_create_prefix")
}
if local_conf.get("always_reply_voice"):
config["reply_voice_mode"] = "always_reply_voice"
elif local_conf.get("voice_reply_voice"):
config["reply_voice_mode"] = "voice_reply_voice"
if pconf("linkai"):
config["group_app_map"] = pconf("linkai").get("group_app_map")
if plugin_config.get("Godcmd"):
config["admin_password"] = plugin_config.get("Godcmd").get("password")
return config

View File

@@ -2,7 +2,6 @@ import io
import os
import re
from urllib.parse import urlparse
from PIL import Image
from common.log import logger
def fsize(file):
@@ -23,6 +22,7 @@ def fsize(file):
def compress_imgfile(file, max_size):
if fsize(file) <= max_size:
return file
from PIL import Image
file.seek(0)
img = Image.open(file)
rgb_image = img.convert("RGB")

View File

@@ -1,15 +1,17 @@
{
"channel_type": "web",
"model": "glm-4.7",
"model": "MiniMax-M2.5",
"minimax_api_key": "",
"zhipu_ai_api_key": "",
"ark_api_key": "",
"moonshot_api_key": "",
"dashscope_api_key": "",
"claude_api_key": "",
"claude_api_base": "https://api.anthropic.com/v1",
"open_ai_api_key": "",
"open_ai_api_base": "https://api.openai.com/v1",
"gemini_api_key": "",
"gemini_api_base": "https://generativelanguage.googleapis.com",
"zhipu_ai_api_key": "",
"minimax_api_key": "",
"dashscope_api_key": "",
"voice_to_text": "openai",
"text_to_voice": "openai",
"voice_reply_voice": false,

View File

@@ -174,7 +174,10 @@ available_setting = {
"zhipu_ai_api_key": "",
"zhipu_ai_api_base": "https://open.bigmodel.cn/api/paas/v4",
"moonshot_api_key": "",
"moonshot_base_url": "https://api.moonshot.cn/v1/chat/completions",
"moonshot_base_url": "https://api.moonshot.cn/v1",
# 豆包(火山方舟) 平台配置
"ark_api_key": "",
"ark_base_url": "https://ark.cn-beijing.volces.com/api/v3",
#魔搭社区 平台配置
"modelscope_api_key": "",
"modelscope_base_url": "https://api-inference.modelscope.cn/v1/chat/completions",

View File

@@ -8,7 +8,7 @@ Cow项目从简单的聊天机器人全面升级为超级智能助理 **CowAgent
- **工具系统**内置实现10+种工具包括文件读写、bash终端、浏览器、定时任务、记忆管理等通过Agent管理你的计算机或服务器
- **长期记忆**:自动将对话记忆持久化至本地文件和数据库中,包括全局记忆和天级记忆,支持关键词及向量检索
- **Skills系统**新增Skill运行引擎内置多种技能并支持通过自然语言对话完成自定义Skills开发
- **多渠道和多模型支持**支持在Web、飞书、钉钉、企微等多渠道与Agent交互支持Claude、Gemini、OpenAI、GLM、MiniMax、Qwen 等多种国内外主流模型
- **多渠道和多模型支持**支持在Web、飞书、钉钉、企微等多渠道与Agent交互支持Claude、Gemini、OpenAI、GLM、MiniMax、Qwen、Kimi、Doubao 等多种国内外主流模型
- **安全和成本**通过秘钥管理工具、提示词控制、系统权限等手段控制Agent的访问安全通过最大记忆轮次、最大上下文token、工具执行步数对token成本进行限制
@@ -137,11 +137,13 @@ bash <(curl -sS https://cdn.link-ai.tech/code/cow/run.sh)
Agent模式推荐使用以下模型可根据效果及成本综合选择
- **MiniMax**: `MiniMax-M2.1`
- **GLM**: `glm-4.7`
- **Qwen**: `qwen3-max`
- **Claude**: `claude-sonnet-4-5``claude-sonnet-4-0`
- **Gemini**: `gemini-3-flash-preview``gemini-3-pro-preview`
- **MiniMax**: `MiniMax-M2.5`
- **GLM**: `glm-5`
- **Kimi**: `kimi-k2.5`
- **Doubao**: `doubao-seed-2-0-code-preview-260215`
- **Qwen**: `qwen3.5-plus`
- **Claude**: `claude-sonnet-4-6`
- **Gemini**: `gemini-3.1-pro-preview`
详细模型配置方式参考 [README.md 模型说明](../README.md#模型说明)

View File

@@ -69,5 +69,8 @@ def create_bot(bot_type):
from models.modelscope.modelscope_bot import ModelScopeBot
return ModelScopeBot()
elif bot_type == const.DOUBAO:
from models.doubao.doubao_bot import DoubaoBot
return DoubaoBot()
raise RuntimeError

View File

@@ -10,25 +10,26 @@ from config import conf, load_config
from .dashscope_session import DashscopeSession
import os
import dashscope
from dashscope import MultiModalConversation
from http import HTTPStatus
# Legacy model name mapping for older dashscope SDK constants.
# New models don't need to be added here — they use their name string directly.
dashscope_models = {
"qwen-turbo": dashscope.Generation.Models.qwen_turbo,
"qwen-plus": dashscope.Generation.Models.qwen_plus,
"qwen-max": dashscope.Generation.Models.qwen_max,
"qwen-bailian-v1": dashscope.Generation.Models.bailian_v1,
# Qwen3 series models - use string directly as model name
"qwen3-max": "qwen3-max",
"qwen3-plus": "qwen3-plus",
"qwen3-turbo": "qwen3-turbo",
# Other new models
"qwen-long": "qwen-long",
"qwq-32b-preview": "qwq-32b-preview",
"qvq-72b-preview": "qvq-72b-preview"
}
# ZhipuAI对话模型API
# Model name prefixes that require MultiModalConversation API instead of Generation API.
# Qwen3.5+ series are omni models that only support MultiModalConversation.
MULTIMODAL_MODEL_PREFIXES = ("qwen3.5-",)
# Qwen对话模型API
class DashscopeBot(Bot):
def __init__(self):
super().__init__()
@@ -39,6 +40,11 @@ class DashscopeBot(Bot):
os.environ["DASHSCOPE_API_KEY"] = self.api_key
self.client = dashscope.Generation
@staticmethod
def _is_multimodal_model(model_name: str) -> bool:
"""Check if the model requires MultiModalConversation API"""
return model_name.startswith(MULTIMODAL_MODEL_PREFIXES)
def reply(self, query, context=None):
# acquire reply content
if context.type == ContextType.TEXT:
@@ -93,16 +99,33 @@ class DashscopeBot(Bot):
"""
try:
dashscope.api_key = self.api_key
response = self.client.call(
dashscope_models[self.model_name],
messages=session.messages,
result_format="message"
)
model = dashscope_models.get(self.model_name, self.model_name)
if self._is_multimodal_model(self.model_name):
mm_messages = self._prepare_messages_for_multimodal(session.messages)
response = MultiModalConversation.call(
model=model,
messages=mm_messages,
result_format="message"
)
else:
response = self.client.call(
model,
messages=session.messages,
result_format="message"
)
if response.status_code == HTTPStatus.OK:
content = response.output.choices[0]["message"]["content"]
resp_dict = self._response_to_dict(response)
choice = resp_dict["output"]["choices"][0]
content = choice.get("message", {}).get("content", "")
# Multimodal models may return content as a list of blocks
if isinstance(content, list):
content = "".join(
item.get("text", "") for item in content if isinstance(item, dict)
)
usage = resp_dict.get("usage", {})
return {
"total_tokens": response.usage["total_tokens"],
"completion_tokens": response.usage["output_tokens"],
"total_tokens": usage.get("total_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"content": content,
}
else:
@@ -232,36 +255,54 @@ class DashscopeBot(Bot):
try:
# Set API key before calling
dashscope.api_key = self.api_key
response = dashscope.Generation.call(
model=dashscope_models.get(model_name, model_name),
messages=messages,
**parameters
)
model = dashscope_models.get(model_name, model_name)
if self._is_multimodal_model(model_name):
messages = self._prepare_messages_for_multimodal(messages)
response = MultiModalConversation.call(
model=model,
messages=messages,
**parameters
)
else:
response = dashscope.Generation.call(
model=model,
messages=messages,
**parameters
)
if response.status_code == HTTPStatus.OK:
# Convert DashScope response to OpenAI-compatible format
choice = response.output.choices[0]
# Convert response to dict to avoid DashScope object KeyError issues
resp_dict = self._response_to_dict(response)
choice = resp_dict["output"]["choices"][0]
message = choice.get("message", {})
content = message.get("content", "")
# Multimodal models may return content as a list of blocks
if isinstance(content, list):
content = "".join(
item.get("text", "") for item in content if isinstance(item, dict)
)
usage = resp_dict.get("usage", {})
return {
"id": response.request_id,
"id": resp_dict.get("request_id"),
"object": "chat.completion",
"created": 0,
"model": model_name,
"choices": [{
"index": 0,
"message": {
"role": choice.message.role,
"content": choice.message.content,
"role": message.get("role", "assistant"),
"content": content,
"tool_calls": self._convert_tool_calls_to_openai_format(
choice.message.get("tool_calls")
message.get("tool_calls")
)
},
"finish_reason": choice.finish_reason
"finish_reason": choice.get("finish_reason")
}],
"usage": {
"prompt_tokens": response.usage.input_tokens,
"completion_tokens": response.usage.output_tokens,
"total_tokens": response.usage.total_tokens
"prompt_tokens": usage.get("input_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0)
}
}
else:
@@ -271,7 +312,7 @@ class DashscopeBot(Bot):
"message": response.message,
"status_code": response.status_code
}
except Exception as e:
logger.error(f"[DASHSCOPE] sync response error: {e}")
return {
@@ -285,48 +326,52 @@ class DashscopeBot(Bot):
try:
# Set API key before calling
dashscope.api_key = self.api_key
responses = dashscope.Generation.call(
model=dashscope_models.get(model_name, model_name),
messages=messages,
stream=True,
**parameters
)
model = dashscope_models.get(model_name, model_name)
if self._is_multimodal_model(model_name):
messages = self._prepare_messages_for_multimodal(messages)
responses = MultiModalConversation.call(
model=model,
messages=messages,
stream=True,
**parameters
)
else:
responses = dashscope.Generation.call(
model=model,
messages=messages,
stream=True,
**parameters
)
# Stream chunks to caller, converting to OpenAI format
for response in responses:
if response.status_code != HTTPStatus.OK:
logger.error(f"[DASHSCOPE] Stream error: {response.code} - {response.message}")
# Convert to dict first to avoid DashScope proxy object KeyError
resp_dict = self._response_to_dict(response)
status_code = resp_dict.get("status_code", 200)
if status_code != HTTPStatus.OK:
err_code = resp_dict.get("code", "")
err_msg = resp_dict.get("message", "Unknown error")
logger.error(f"[DASHSCOPE] Stream error: {err_code} - {err_msg}")
yield {
"error": True,
"message": response.message,
"status_code": response.status_code
"message": err_msg,
"status_code": status_code
}
continue
# Get choice - use try-except because DashScope raises KeyError on hasattr()
try:
if isinstance(response.output, dict):
choice = response.output['choices'][0]
else:
choice = response.output.choices[0]
except (KeyError, AttributeError, IndexError) as e:
logger.warning(f"[DASHSCOPE] Cannot get choice: {e}")
choices = resp_dict.get("output", {}).get("choices", [])
if not choices:
continue
# Get finish_reason safely
finish_reason = None
try:
if isinstance(choice, dict):
finish_reason = choice.get('finish_reason')
else:
finish_reason = choice.finish_reason
except (KeyError, AttributeError):
pass
choice = choices[0]
finish_reason = choice.get("finish_reason")
message = choice.get("message", {})
# Convert to OpenAI-compatible format
openai_chunk = {
"id": response.request_id,
"id": resp_dict.get("request_id"),
"object": "chat.completion.chunk",
"created": 0,
"model": model_name,
@@ -336,66 +381,90 @@ class DashscopeBot(Bot):
"finish_reason": finish_reason
}]
}
# Get message safely - use try-except
message = {}
try:
if isinstance(choice, dict):
message = choice.get('message', {})
else:
message = choice.message
except (KeyError, AttributeError):
pass
# Add role if present
role = None
try:
if isinstance(message, dict):
role = message.get('role')
else:
role = message.role
except (KeyError, AttributeError):
pass
# Add role
role = message.get("role")
if role:
openai_chunk["choices"][0]["delta"]["role"] = role
# Add content if present
content = None
try:
if isinstance(message, dict):
content = message.get('content')
else:
content = message.content
except (KeyError, AttributeError):
pass
# Add reasoning_content (thinking process from models like qwen3.5)
reasoning_content = message.get("reasoning_content")
if reasoning_content:
openai_chunk["choices"][0]["delta"]["reasoning_content"] = reasoning_content
# Add content (multimodal models may return list of blocks)
content = message.get("content")
if isinstance(content, list):
content = "".join(
item.get("text", "") for item in content if isinstance(item, dict)
)
if content:
openai_chunk["choices"][0]["delta"]["content"] = content
# Add tool_calls if present
# DashScope's response object raises KeyError on hasattr() if attr doesn't exist
# So we use try-except instead
tool_calls = None
try:
if isinstance(message, dict):
tool_calls = message.get('tool_calls')
else:
tool_calls = message.tool_calls
except (KeyError, AttributeError):
pass
# Add tool_calls
tool_calls = message.get("tool_calls")
if tool_calls:
openai_chunk["choices"][0]["delta"]["tool_calls"] = self._convert_tool_calls_to_openai_format(tool_calls)
yield openai_chunk
except Exception as e:
logger.error(f"[DASHSCOPE] stream response error: {e}")
logger.error(f"[DASHSCOPE] stream response error: {e}", exc_info=True)
yield {
"error": True,
"message": str(e),
"status_code": 500
}
@staticmethod
def _response_to_dict(response) -> dict:
"""
Convert DashScope response object to a plain dict.
DashScope SDK wraps responses in proxy objects whose __getattr__
delegates to __getitem__, raising KeyError (not AttributeError)
when an attribute is missing. Standard hasattr / getattr only
catch AttributeError, so we must use try-except everywhere.
"""
_SENTINEL = object()
def _safe_getattr(obj, name, default=_SENTINEL):
"""getattr that also catches KeyError from DashScope proxy objects."""
try:
return getattr(obj, name)
except (AttributeError, KeyError, TypeError):
return default
def _has_attr(obj, name):
return _safe_getattr(obj, name) is not _SENTINEL
def _to_dict(obj):
if isinstance(obj, (str, int, float, bool, type(None))):
return obj
if isinstance(obj, dict):
return {k: _to_dict(v) for k, v in obj.items()}
if isinstance(obj, (list, tuple)):
return [_to_dict(i) for i in obj]
# DashScope response objects behave like dicts (have .keys())
if _has_attr(obj, "keys"):
try:
return {k: _to_dict(obj[k]) for k in obj.keys()}
except Exception:
pass
return obj
result = {}
# Extract known top-level fields safely
for attr in ("request_id", "status_code", "code", "message", "output", "usage"):
val = _safe_getattr(response, attr)
if val is _SENTINEL:
try:
val = response[attr]
except (KeyError, TypeError, IndexError):
continue
result[attr] = _to_dict(val)
return result
def _convert_tools_to_dashscope_format(self, tools):
"""
Convert tools from Claude format to DashScope format
@@ -424,6 +493,37 @@ class DashscopeBot(Bot):
return dashscope_tools
@staticmethod
def _prepare_messages_for_multimodal(messages: list) -> list:
"""
Ensure messages are compatible with MultiModalConversation API.
MultiModalConversation._preprocess_messages iterates every message
with ``content = message["content"]; for elem in content: ...``,
which means:
1. Every message MUST have a 'content' key.
2. 'content' MUST be an iterable (list), not a plain string.
The expected format is [{"text": "..."}, ...].
Meanwhile the DashScope API requires role='tool' messages to follow
assistant tool_calls, so we must NOT convert them to role='user'.
We just ensure they have a list-typed 'content'.
"""
result = []
for msg in messages:
msg = dict(msg) # shallow copy
# Normalize content to list format [{"text": "..."}]
content = msg.get("content")
if content is None or (isinstance(content, str) and content == ""):
msg["content"] = [{"text": ""}]
elif isinstance(content, str):
msg["content"] = [{"text": content}]
# If content is already a list, keep as-is (already in multimodal format)
result.append(msg)
return result
def _convert_messages_to_dashscope_format(self, messages):
"""
Convert messages from Claude format to DashScope format

View File

520
models/doubao/doubao_bot.py Normal file
View File

@@ -0,0 +1,520 @@
# encoding:utf-8
import json
import time
import requests
from models.bot import Bot
from models.session_manager import SessionManager
from bridge.context import ContextType
from bridge.reply import Reply, ReplyType
from common.log import logger
from config import conf, load_config
from .doubao_session import DoubaoSession
# Doubao (火山方舟 / Volcengine Ark) API Bot
class DoubaoBot(Bot):
def __init__(self):
super().__init__()
self.sessions = SessionManager(DoubaoSession, model=conf().get("model") or "doubao-seed-2-0-pro-260215")
model = conf().get("model") or "doubao-seed-2-0-pro-260215"
self.args = {
"model": model,
"temperature": conf().get("temperature", 0.8),
"top_p": conf().get("top_p", 1.0),
}
self.api_key = conf().get("ark_api_key")
self.base_url = conf().get("ark_base_url", "https://ark.cn-beijing.volces.com/api/v3")
# Ensure base_url does not end with /chat/completions
if self.base_url.endswith("/chat/completions"):
self.base_url = self.base_url.rsplit("/chat/completions", 1)[0]
if self.base_url.endswith("/"):
self.base_url = self.base_url.rstrip("/")
def reply(self, query, context=None):
# acquire reply content
if context.type == ContextType.TEXT:
logger.info("[DOUBAO] query={}".format(query))
session_id = context["session_id"]
reply = None
clear_memory_commands = conf().get("clear_memory_commands", ["#清除记忆"])
if query in clear_memory_commands:
self.sessions.clear_session(session_id)
reply = Reply(ReplyType.INFO, "记忆已清除")
elif query == "#清除所有":
self.sessions.clear_all_session()
reply = Reply(ReplyType.INFO, "所有人记忆已清除")
elif query == "#更新配置":
load_config()
reply = Reply(ReplyType.INFO, "配置已更新")
if reply:
return reply
session = self.sessions.session_query(query, session_id)
logger.debug("[DOUBAO] session query={}".format(session.messages))
model = context.get("doubao_model")
new_args = self.args.copy()
if model:
new_args["model"] = model
reply_content = self.reply_text(session, args=new_args)
logger.debug(
"[DOUBAO] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
session.messages,
session_id,
reply_content["content"],
reply_content["completion_tokens"],
)
)
if reply_content["completion_tokens"] == 0 and len(reply_content["content"]) > 0:
reply = Reply(ReplyType.ERROR, reply_content["content"])
elif reply_content["completion_tokens"] > 0:
self.sessions.session_reply(reply_content["content"], session_id, reply_content["total_tokens"])
reply = Reply(ReplyType.TEXT, reply_content["content"])
else:
reply = Reply(ReplyType.ERROR, reply_content["content"])
logger.debug("[DOUBAO] reply {} used 0 tokens.".format(reply_content))
return reply
else:
reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
return reply
def reply_text(self, session: DoubaoSession, args=None, retry_count: int = 0) -> dict:
"""
Call Doubao chat completion API to get the answer
:param session: a conversation session
:param args: model args
:param retry_count: retry count
:return: {}
"""
try:
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer " + self.api_key
}
body = args.copy()
body["messages"] = session.messages
# Disable thinking by default for better efficiency
body["thinking"] = {"type": "disabled"}
res = requests.post(
f"{self.base_url}/chat/completions",
headers=headers,
json=body
)
if res.status_code == 200:
response = res.json()
return {
"total_tokens": response["usage"]["total_tokens"],
"completion_tokens": response["usage"]["completion_tokens"],
"content": response["choices"][0]["message"]["content"]
}
else:
response = res.json()
error = response.get("error", {})
logger.error(f"[DOUBAO] chat failed, status_code={res.status_code}, "
f"msg={error.get('message')}, type={error.get('type')}")
result = {"completion_tokens": 0, "content": "提问太快啦,请休息一下再问我吧"}
need_retry = False
if res.status_code >= 500:
logger.warn(f"[DOUBAO] do retry, times={retry_count}")
need_retry = retry_count < 2
elif res.status_code == 401:
result["content"] = "授权失败请检查API Key是否正确"
elif res.status_code == 429:
result["content"] = "请求过于频繁,请稍后再试"
need_retry = retry_count < 2
else:
need_retry = False
if need_retry:
time.sleep(3)
return self.reply_text(session, args, retry_count + 1)
else:
return result
except Exception as e:
logger.exception(e)
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if need_retry:
return self.reply_text(session, args, retry_count + 1)
else:
return result
# ==================== Agent mode support ====================
def call_with_tools(self, messages, tools=None, stream: bool = False, **kwargs):
"""
Call Doubao API with tool support for agent integration.
This method handles:
1. Format conversion (Claude format -> OpenAI format)
2. System prompt injection
3. Streaming SSE response with tool_calls
4. Thinking (reasoning) is disabled by default for efficiency
Args:
messages: List of messages (may be in Claude format from agent)
tools: List of tool definitions (may be in Claude format from agent)
stream: Whether to use streaming
**kwargs: Additional parameters (max_tokens, temperature, system, model, etc.)
Returns:
Generator yielding OpenAI-format chunks (for streaming)
"""
try:
# Convert messages from Claude format to OpenAI format
converted_messages = self._convert_messages_to_openai_format(messages)
# Inject system prompt if provided
system_prompt = kwargs.pop("system", None)
if system_prompt:
if not converted_messages or converted_messages[0].get("role") != "system":
converted_messages.insert(0, {"role": "system", "content": system_prompt})
else:
converted_messages[0] = {"role": "system", "content": system_prompt}
# Convert tools from Claude format to OpenAI format
converted_tools = None
if tools:
converted_tools = self._convert_tools_to_openai_format(tools)
# Resolve model / temperature
model = kwargs.pop("model", None) or self.args["model"]
max_tokens = kwargs.pop("max_tokens", None)
# Don't pop temperature, just ignore it - let API use default
kwargs.pop("temperature", None)
# Build request body (omit temperature, let the API use its own default)
request_body = {
"model": model,
"messages": converted_messages,
"stream": stream,
}
if max_tokens is not None:
request_body["max_tokens"] = max_tokens
# Add tools
if converted_tools:
request_body["tools"] = converted_tools
request_body["tool_choice"] = "auto"
# Explicitly disable thinking to avoid reasoning_content issues
# in multi-turn tool calls
request_body["thinking"] = {"type": "disabled"}
logger.debug(f"[DOUBAO] API call: model={model}, "
f"tools={len(converted_tools) if converted_tools else 0}, stream={stream}")
if stream:
return self._handle_stream_response(request_body)
else:
return self._handle_sync_response(request_body)
except Exception as e:
logger.error(f"[DOUBAO] call_with_tools error: {e}")
import traceback
logger.error(traceback.format_exc())
def error_generator():
yield {"error": True, "message": str(e), "status_code": 500}
return error_generator()
# -------------------- streaming --------------------
def _handle_stream_response(self, request_body: dict):
"""Handle streaming SSE response from Doubao API and yield OpenAI-format chunks."""
try:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {self.api_key}"
}
url = f"{self.base_url}/chat/completions"
response = requests.post(url, headers=headers, json=request_body, stream=True, timeout=120)
if response.status_code != 200:
error_msg = response.text
logger.error(f"[DOUBAO] API error: status={response.status_code}, msg={error_msg}")
yield {"error": True, "message": error_msg, "status_code": response.status_code}
return
current_tool_calls = {}
finish_reason = None
for line in response.iter_lines():
if not line:
continue
line = line.decode("utf-8")
if not line.startswith("data: "):
continue
data_str = line[6:] # Remove "data: " prefix
if data_str.strip() == "[DONE]":
break
try:
chunk = json.loads(data_str)
except json.JSONDecodeError as e:
logger.warning(f"[DOUBAO] JSON decode error: {e}, data: {data_str[:200]}")
continue
# Check for error in chunk
if chunk.get("error"):
error_data = chunk["error"]
error_msg = error_data.get("message", "Unknown error") if isinstance(error_data, dict) else str(error_data)
logger.error(f"[DOUBAO] stream error: {error_msg}")
yield {"error": True, "message": error_msg, "status_code": 500}
return
if not chunk.get("choices"):
continue
choice = chunk["choices"][0]
delta = choice.get("delta", {})
# Skip reasoning_content (thinking) - don't log or forward
if delta.get("reasoning_content"):
continue
# Handle text content
if "content" in delta and delta["content"]:
yield {
"choices": [{
"index": 0,
"delta": {
"role": "assistant",
"content": delta["content"]
}
}]
}
# Handle tool_calls (streamed incrementally)
if "tool_calls" in delta:
for tool_call_chunk in delta["tool_calls"]:
index = tool_call_chunk.get("index", 0)
if index not in current_tool_calls:
current_tool_calls[index] = {
"id": tool_call_chunk.get("id", ""),
"type": "tool_use",
"name": tool_call_chunk.get("function", {}).get("name", ""),
"input": ""
}
# Accumulate arguments
if "function" in tool_call_chunk and "arguments" in tool_call_chunk["function"]:
current_tool_calls[index]["input"] += tool_call_chunk["function"]["arguments"]
# Yield OpenAI-format tool call delta
yield {
"choices": [{
"index": 0,
"delta": {
"tool_calls": [tool_call_chunk]
}
}]
}
# Capture finish_reason
if choice.get("finish_reason"):
finish_reason = choice["finish_reason"]
# Final chunk with finish_reason
yield {
"choices": [{
"index": 0,
"delta": {},
"finish_reason": finish_reason
}]
}
except requests.exceptions.Timeout:
logger.error("[DOUBAO] Request timeout")
yield {"error": True, "message": "Request timeout", "status_code": 500}
except Exception as e:
logger.error(f"[DOUBAO] stream response error: {e}")
import traceback
logger.error(traceback.format_exc())
yield {"error": True, "message": str(e), "status_code": 500}
# -------------------- sync --------------------
def _handle_sync_response(self, request_body: dict):
"""Handle synchronous API response and yield a single result dict."""
try:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {self.api_key}"
}
request_body.pop("stream", None)
url = f"{self.base_url}/chat/completions"
response = requests.post(url, headers=headers, json=request_body, timeout=120)
if response.status_code != 200:
error_msg = response.text
logger.error(f"[DOUBAO] API error: status={response.status_code}, msg={error_msg}")
yield {"error": True, "message": error_msg, "status_code": response.status_code}
return
result = response.json()
message = result["choices"][0]["message"]
finish_reason = result["choices"][0]["finish_reason"]
response_data = {"role": "assistant", "content": []}
# Add text content
if message.get("content"):
response_data["content"].append({
"type": "text",
"text": message["content"]
})
# Add tool calls
if message.get("tool_calls"):
for tool_call in message["tool_calls"]:
response_data["content"].append({
"type": "tool_use",
"id": tool_call["id"],
"name": tool_call["function"]["name"],
"input": json.loads(tool_call["function"]["arguments"])
})
# Map finish_reason
if finish_reason == "tool_calls":
response_data["stop_reason"] = "tool_use"
elif finish_reason == "stop":
response_data["stop_reason"] = "end_turn"
else:
response_data["stop_reason"] = finish_reason
yield response_data
except requests.exceptions.Timeout:
logger.error("[DOUBAO] Request timeout")
yield {"error": True, "message": "Request timeout", "status_code": 500}
except Exception as e:
logger.error(f"[DOUBAO] sync response error: {e}")
import traceback
logger.error(traceback.format_exc())
yield {"error": True, "message": str(e), "status_code": 500}
# -------------------- format conversion --------------------
def _convert_messages_to_openai_format(self, messages):
"""
Convert messages from Claude format to OpenAI format.
Claude format uses content blocks: tool_use / tool_result / text
OpenAI format uses tool_calls in assistant, role=tool for results
"""
if not messages:
return []
converted = []
for msg in messages:
role = msg.get("role")
content = msg.get("content")
# Already a simple string - pass through
if isinstance(content, str):
converted.append(msg)
continue
if not isinstance(content, list):
converted.append(msg)
continue
if role == "user":
text_parts = []
tool_results = []
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
tool_call_id = block.get("tool_use_id") or ""
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
# Tool results first (must come right after assistant with tool_calls)
for tr in tool_results:
converted.append(tr)
if text_parts:
converted.append({"role": "user", "content": "\n".join(text_parts)})
elif role == "assistant":
openai_msg = {"role": "assistant"}
text_parts = []
tool_calls = []
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_use":
tool_calls.append({
"id": block.get("id"),
"type": "function",
"function": {
"name": block.get("name"),
"arguments": json.dumps(block.get("input", {}))
}
})
if text_parts:
openai_msg["content"] = "\n".join(text_parts)
elif not tool_calls:
openai_msg["content"] = ""
if tool_calls:
openai_msg["tool_calls"] = tool_calls
if not text_parts:
openai_msg["content"] = None
converted.append(openai_msg)
else:
converted.append(msg)
return converted
def _convert_tools_to_openai_format(self, tools):
"""
Convert tools from Claude format to OpenAI format.
Claude: {name, description, input_schema}
OpenAI: {type: "function", function: {name, description, parameters}}
"""
if not tools:
return None
converted = []
for tool in tools:
# Already in OpenAI format
if "type" in tool and tool["type"] == "function":
converted.append(tool)
else:
converted.append({
"type": "function",
"function": {
"name": tool.get("name"),
"description": tool.get("description"),
"parameters": tool.get("input_schema", {})
}
})
return converted

View File

@@ -0,0 +1,51 @@
from models.session_manager import Session
from common.log import logger
class DoubaoSession(Session):
def __init__(self, session_id, system_prompt=None, model="doubao-seed-2-0-pro-260215"):
super().__init__(session_id, system_prompt)
self.model = model
self.reset()
def discard_exceeding(self, max_tokens, cur_tokens=None):
precise = True
try:
cur_tokens = self.calc_tokens()
except Exception as e:
precise = False
if cur_tokens is None:
raise e
logger.debug("Exception when counting tokens precisely for query: {}".format(e))
while cur_tokens > max_tokens:
if len(self.messages) > 2:
self.messages.pop(1)
elif len(self.messages) == 2 and self.messages[1]["role"] == "assistant":
self.messages.pop(1)
if precise:
cur_tokens = self.calc_tokens()
else:
cur_tokens = cur_tokens - max_tokens
break
elif len(self.messages) == 2 and self.messages[1]["role"] == "user":
logger.warn("user message exceed max_tokens. total_tokens={}".format(cur_tokens))
break
else:
logger.debug("max_tokens={}, total_tokens={}, len(messages)={}".format(
max_tokens, cur_tokens, len(self.messages)))
break
if precise:
cur_tokens = self.calc_tokens()
else:
cur_tokens = cur_tokens - max_tokens
return cur_tokens
def calc_tokens(self):
return num_tokens_from_messages(self.messages, self.model)
def num_tokens_from_messages(messages, model):
tokens = 0
for msg in messages:
tokens += len(msg["content"])
return tokens

View File

@@ -6,11 +6,14 @@ Google gemini bot
"""
# encoding:utf-8
import base64
import json
import mimetypes
import os
import re
import time
import requests
from models.bot import Bot
import google.generativeai as genai
from models.session_manager import SessionManager
from bridge.context import ContextType, Context
from bridge.reply import Reply, ReplyType
@@ -18,7 +21,6 @@ from common.log import logger
from config import conf
from models.chatgpt.chat_gpt_session import ChatGPTSession
from models.baidu.baidu_wenxin_session import BaiduWenxinSession
from google.generativeai.types import HarmCategory, HarmBlockThreshold
# OpenAI对话模型API (可用)
@@ -43,6 +45,7 @@ class GoogleGeminiBot(Bot):
self.api_base = "https://generativelanguage.googleapis.com"
def reply(self, query, context: Context = None) -> Reply:
session_id = None
try:
if context.type != ContextType.TEXT:
logger.warn(f"[Gemini] Unsupported message type, type={context.type}")
@@ -50,43 +53,47 @@ class GoogleGeminiBot(Bot):
logger.info(f"[Gemini] query={query}")
session_id = context["session_id"]
session = self.sessions.session_query(query, session_id)
gemini_messages = self._convert_to_gemini_messages(self.filter_messages(session.messages))
logger.debug(f"[Gemini] messages={gemini_messages}")
genai.configure(api_key=self.api_key)
model = genai.GenerativeModel(self.model)
# 添加安全设置
safety_settings = {
HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
}
# 生成回复,包含安全设置
response = model.generate_content(
gemini_messages,
safety_settings=safety_settings
filtered_messages = self.filter_messages(session.messages)
logger.debug(f"[Gemini] messages={filtered_messages}")
response = self.call_with_tools(
messages=filtered_messages,
tools=None,
stream=False,
model=self.model
)
if response.candidates and response.candidates[0].content:
reply_text = response.candidates[0].content.parts[0].text
logger.info(f"[Gemini] reply={reply_text}")
self.sessions.session_reply(reply_text, session_id)
return Reply(ReplyType.TEXT, reply_text)
else:
# 没有有效响应内容,可能内容被屏蔽,输出安全评分
logger.warning("[Gemini] No valid response generated. Checking safety ratings.")
if hasattr(response, 'candidates') and response.candidates:
for rating in response.candidates[0].safety_ratings:
logger.warning(f"Safety rating: {rating.category} - {rating.probability}")
error_message = "No valid response generated due to safety constraints."
if isinstance(response, dict) and response.get("error"):
error_message = response.get("message", "Failed to invoke [Gemini] api!")
logger.error(f"[Gemini] API error: {error_message}")
self.sessions.session_reply(error_message, session_id)
return Reply(ReplyType.ERROR, error_message)
choices = response.get("choices", []) if isinstance(response, dict) else []
if choices and choices[0].get("message"):
reply_text = choices[0]["message"].get("content")
if reply_text:
logger.info(f"[Gemini] reply={reply_text}")
self.sessions.session_reply(reply_text, session_id)
return Reply(ReplyType.TEXT, reply_text)
logger.warning("[Gemini] No valid response generated. Checking safety ratings.")
safety_ratings = response.get("safety_ratings", []) if isinstance(response, dict) else []
if safety_ratings:
for rating in safety_ratings:
category = rating.get("category", "UNKNOWN")
probability = rating.get("probability", "UNKNOWN")
logger.warning(f"[Gemini] Safety rating: {category} - {probability}")
error_message = "No valid response generated due to safety constraints."
self.sessions.session_reply(error_message, session_id)
return Reply(ReplyType.ERROR, error_message)
except Exception as e:
logger.error(f"[Gemini] Error generating response: {str(e)}", exc_info=True)
error_message = "Failed to invoke [Gemini] api!"
self.sessions.session_reply(error_message, session_id)
if session_id:
self.sessions.session_reply(error_message, session_id)
return Reply(ReplyType.ERROR, error_message)
def _convert_to_gemini_messages(self, messages: list):
@@ -127,6 +134,93 @@ class GoogleGeminiBot(Bot):
turn = "user"
return res
@staticmethod
def _extract_image_paths_from_text(content: str):
if not isinstance(content, str):
return "", []
pattern = r"\[图片:\s*([^\]]+)\]"
image_paths = [m.strip().strip("'\"") for m in re.findall(pattern, content) if m.strip()]
cleaned_text = re.sub(pattern, "", content)
cleaned_text = re.sub(r"\n{3,}", "\n\n", cleaned_text).strip()
return cleaned_text, image_paths
@staticmethod
def _build_image_inline_part(image_path: str):
if not image_path:
return None
try:
if image_path.startswith("file://"):
image_path = image_path[7:]
image_path = os.path.expanduser(image_path)
if not os.path.exists(image_path):
logger.warning(f"[Gemini] Image file not found: {image_path}")
return None
with open(image_path, "rb") as f:
image_bytes = f.read()
mime_type = mimetypes.guess_type(image_path)[0] or "image/png"
if not mime_type.startswith("image/"):
mime_type = "image/png"
return {
"inlineData": {
"mimeType": mime_type,
"data": base64.b64encode(image_bytes).decode("utf-8")
}
}
except Exception as e:
logger.warning(f"[Gemini] Failed to build inline image part from path={image_path}, err={e}")
return None
@staticmethod
def _build_inline_part_from_image_url(image_url):
if not image_url:
return None
if isinstance(image_url, dict):
image_url = image_url.get("url")
if not image_url or not isinstance(image_url, str):
return None
if image_url.startswith("data:"):
match = re.match(r"^data:([^;]+);base64,(.+)$", image_url, re.DOTALL)
if not match:
logger.warning("[Gemini] Invalid data URL for image block")
return None
return {
"inlineData": {
"mimeType": match.group(1),
"data": match.group(2).strip()
}
}
if image_url.startswith("file://") or os.path.exists(os.path.expanduser(image_url)):
return GoogleGeminiBot._build_image_inline_part(image_url)
if image_url.startswith("http://") or image_url.startswith("https://"):
try:
response = requests.get(image_url, timeout=20)
if response.status_code != 200:
logger.warning(f"[Gemini] Failed to fetch remote image: status={response.status_code}, url={image_url}")
return None
mime_type = response.headers.get("Content-Type", "image/png").split(";")[0].strip()
if not mime_type.startswith("image/"):
mime_type = "image/png"
return {
"inlineData": {
"mimeType": mime_type,
"data": base64.b64encode(response.content).decode("utf-8")
}
}
except Exception as e:
logger.warning(f"[Gemini] Failed to download remote image: url={image_url}, err={e}")
return None
logger.warning(f"[Gemini] Unsupported image URL format: {image_url[:120]}")
return None
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call Gemini API with tool support using REST API (following official docs)
@@ -145,6 +239,15 @@ class GoogleGeminiBot(Bot):
# Build REST API payload
payload = {"contents": []}
inline_image_count = 0
# Keep legacy behavior: disable Gemini safety blocking like old SDK path.
payload["safetySettings"] = [
{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
]
# Extract and set system instruction
system_prompt = kwargs.get("system", "")
@@ -174,8 +277,19 @@ class GoogleGeminiBot(Bot):
parts = []
if isinstance(content, str):
# Simple text content
parts.append({"text": content})
# Text with optional [图片: /path/to/file] markers
cleaned_text, image_paths = self._extract_image_paths_from_text(content)
if cleaned_text:
parts.append({"text": cleaned_text})
image_added = False
for image_path in image_paths:
image_part = self._build_image_inline_part(image_path)
if image_part:
parts.append(image_part)
image_added = True
inline_image_count += 1
if not cleaned_text and not image_added and content:
parts.append({"text": content})
elif isinstance(content, list):
# List of content blocks (Claude format)
@@ -188,8 +302,39 @@ class GoogleGeminiBot(Bot):
block_type = block.get("type")
if block_type == "text":
# Text block
parts.append({"text": block.get("text", "")})
# Text block with optional image markers
block_text = block.get("text", "")
cleaned_text, image_paths = self._extract_image_paths_from_text(block_text)
if cleaned_text:
parts.append({"text": cleaned_text})
for image_path in image_paths:
image_part = self._build_image_inline_part(image_path)
if image_part:
parts.append(image_part)
elif block_type in ["image", "image_url"]:
# OpenAI format: {"type":"image_url","image_url":{"url":"..."}}
# Claude format: {"type":"image","source":{"type":"base64","media_type":"...","data":"..."}}
image_part = None
if block_type == "image":
source = block.get("source", {})
if isinstance(source, dict) and source.get("type") == "base64" and source.get("data"):
image_part = {
"inlineData": {
"mimeType": source.get("media_type", "image/png"),
"data": source.get("data")
}
}
elif block.get("image_url"):
image_part = self._build_inline_part_from_image_url(block.get("image_url"))
else:
image_part = self._build_inline_part_from_image_url(block.get("image_url"))
if image_part:
parts.append(image_part)
inline_image_count += 1
else:
logger.warning(f"[Gemini] Skip invalid image block: {str(block)[:200]}")
elif block_type == "tool_result":
# Convert Claude tool_result to Gemini functionResponse
@@ -237,6 +382,9 @@ class GoogleGeminiBot(Bot):
"role": gemini_role,
"parts": parts
})
if inline_image_count > 0:
logger.info(f"[Gemini] Multimodal request includes {inline_image_count} image part(s)")
# Generation config
gen_config = {}
@@ -363,15 +511,18 @@ class GoogleGeminiBot(Bot):
candidates = data.get("candidates", [])
if not candidates:
logger.warning("[Gemini] No candidates in response")
prompt_feedback = data.get("promptFeedback", {})
return {
"error": True,
"message": "No candidates in response",
"status_code": 500
"status_code": 500,
"safety_ratings": prompt_feedback.get("safetyRatings", [])
}
candidate = candidates[0]
content = candidate.get("content", {})
parts = content.get("parts", [])
safety_ratings = candidate.get("safetyRatings", [])
logger.debug(f"[Gemini] Candidate parts count: {len(parts)}")
@@ -419,7 +570,8 @@ class GoogleGeminiBot(Bot):
"message": message_dict,
"finish_reason": "tool_calls" if tool_calls else "stop"
}],
"usage": data.get("usageMetadata", {})
"usage": data.get("usageMetadata", {}),
"safety_ratings": safety_ratings
}
except Exception as e:

View File

@@ -1,9 +1,9 @@
# encoding:utf-8
import json
import time
import openai
import openai.error
import requests
from models.bot import Bot
from models.session_manager import SessionManager
from bridge.context import ContextType
@@ -11,10 +11,9 @@ from bridge.reply import Reply, ReplyType
from common.log import logger
from config import conf, load_config
from .moonshot_session import MoonshotSession
import requests
# ZhipuAI对话模型API
# Moonshot (Kimi) API Bot
class MoonshotBot(Bot):
def __init__(self):
super().__init__()
@@ -23,17 +22,22 @@ class MoonshotBot(Bot):
if model == "moonshot":
model = "moonshot-v1-32k"
self.args = {
"model": model, # 对话模型的名称
"temperature": conf().get("temperature", 0.3), # 如果设置,值域须为 [0, 1] 我们推荐 0.3,以达到较合适的效果。
"top_p": conf().get("top_p", 1.0), # 使用默认值
"model": model,
"temperature": conf().get("temperature", 0.3),
"top_p": conf().get("top_p", 1.0),
}
self.api_key = conf().get("moonshot_api_key")
self.base_url = conf().get("moonshot_base_url", "https://api.moonshot.cn/v1/chat/completions")
self.base_url = conf().get("moonshot_base_url", "https://api.moonshot.cn/v1")
# Ensure base_url does not end with /chat/completions (backward compat)
if self.base_url.endswith("/chat/completions"):
self.base_url = self.base_url.rsplit("/chat/completions", 1)[0]
if self.base_url.endswith("/"):
self.base_url = self.base_url.rstrip("/")
def reply(self, query, context=None):
# acquire reply content
if context.type == ContextType.TEXT:
logger.info("[MOONSHOT_AI] query={}".format(query))
logger.info("[MOONSHOT] query={}".format(query))
session_id = context["session_id"]
reply = None
@@ -50,19 +54,16 @@ class MoonshotBot(Bot):
if reply:
return reply
session = self.sessions.session_query(query, session_id)
logger.debug("[MOONSHOT_AI] session query={}".format(session.messages))
logger.debug("[MOONSHOT] session query={}".format(session.messages))
model = context.get("moonshot_model")
new_args = self.args.copy()
if model:
new_args["model"] = model
# if context.get('stream'):
# # reply in stream
# return self.reply_text_stream(query, new_query, session_id)
reply_content = self.reply_text(session, args=new_args)
logger.debug(
"[MOONSHOT_AI] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
"[MOONSHOT] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
session.messages,
session_id,
reply_content["content"],
@@ -76,17 +77,17 @@ class MoonshotBot(Bot):
reply = Reply(ReplyType.TEXT, reply_content["content"])
else:
reply = Reply(ReplyType.ERROR, reply_content["content"])
logger.debug("[MOONSHOT_AI] reply {} used 0 tokens.".format(reply_content))
logger.debug("[MOONSHOT] reply {} used 0 tokens.".format(reply_content))
return reply
else:
reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
return reply
def reply_text(self, session: MoonshotSession, args=None, retry_count=0) -> dict:
def reply_text(self, session: MoonshotSession, args=None, retry_count: int = 0) -> dict:
"""
call openai's ChatCompletion to get the answer
Call Moonshot chat completion API to get the answer
:param session: a conversation session
:param session_id: session id
:param args: model args
:param retry_count: retry count
:return: {}
"""
@@ -97,10 +98,8 @@ class MoonshotBot(Bot):
}
body = args
body["messages"] = session.messages
# logger.debug("[MOONSHOT_AI] response={}".format(response))
# logger.info("[MOONSHOT_AI] reply={}, total_tokens={}".format(response.choices[0]['message']['content'], response["usage"]["total_tokens"]))
res = requests.post(
self.base_url,
f"{self.base_url}/chat/completions",
headers=headers,
json=body
)
@@ -114,14 +113,13 @@ class MoonshotBot(Bot):
else:
response = res.json()
error = response.get("error")
logger.error(f"[MOONSHOT_AI] chat failed, status_code={res.status_code}, "
logger.error(f"[MOONSHOT] chat failed, status_code={res.status_code}, "
f"msg={error.get('message')}, type={error.get('type')}")
result = {"completion_tokens": 0, "content": "提问太快啦,请休息一下再问我吧"}
need_retry = False
if res.status_code >= 500:
# server error, need retry
logger.warn(f"[MOONSHOT_AI] do retry, times={retry_count}")
logger.warn(f"[MOONSHOT] do retry, times={retry_count}")
need_retry = retry_count < 2
elif res.status_code == 401:
result["content"] = "授权失败请检查API Key是否正确"
@@ -144,3 +142,380 @@ class MoonshotBot(Bot):
return self.reply_text(session, args, retry_count + 1)
else:
return result
# ==================== Agent mode support ====================
def call_with_tools(self, messages, tools=None, stream: bool = False, **kwargs):
"""
Call Moonshot API with tool support for agent integration.
This method handles:
1. Format conversion (Claude format -> OpenAI format)
2. System prompt injection
3. Streaming SSE response with tool_calls
4. Thinking (reasoning) is disabled by default to avoid tool_choice conflicts
Args:
messages: List of messages (may be in Claude format from agent)
tools: List of tool definitions (may be in Claude format from agent)
stream: Whether to use streaming
**kwargs: Additional parameters (max_tokens, temperature, system, model, etc.)
Returns:
Generator yielding OpenAI-format chunks (for streaming)
"""
try:
# Convert messages from Claude format to OpenAI format
converted_messages = self._convert_messages_to_openai_format(messages)
# Inject system prompt if provided
system_prompt = kwargs.pop("system", None)
if system_prompt:
if not converted_messages or converted_messages[0].get("role") != "system":
converted_messages.insert(0, {"role": "system", "content": system_prompt})
else:
converted_messages[0] = {"role": "system", "content": system_prompt}
# Convert tools from Claude format to OpenAI format
converted_tools = None
if tools:
converted_tools = self._convert_tools_to_openai_format(tools)
# Resolve model / temperature
model = kwargs.pop("model", None) or self.args["model"]
max_tokens = kwargs.pop("max_tokens", None)
# Don't pop temperature, just ignore it
kwargs.pop("temperature", None)
# Build request body (omit temperature, let the API use its own default)
request_body = {
"model": model,
"messages": converted_messages,
"stream": stream,
}
if max_tokens is not None:
request_body["max_tokens"] = max_tokens
# Add tools
if converted_tools:
request_body["tools"] = converted_tools
request_body["tool_choice"] = "auto"
# Explicitly disable thinking to avoid reasoning_content issues in multi-turn tool calls.
# kimi-k2.5 may enable thinking by default; without preserving reasoning_content
# in conversation history the API will reject subsequent requests.
request_body["thinking"] = {"type": "disabled"}
logger.debug(f"[MOONSHOT] API call: model={model}, "
f"tools={len(converted_tools) if converted_tools else 0}, stream={stream}")
if stream:
return self._handle_stream_response(request_body)
else:
return self._handle_sync_response(request_body)
except Exception as e:
logger.error(f"[MOONSHOT] call_with_tools error: {e}")
import traceback
logger.error(traceback.format_exc())
def error_generator():
yield {"error": True, "message": str(e), "status_code": 500}
return error_generator()
# -------------------- streaming --------------------
def _handle_stream_response(self, request_body: dict):
"""Handle streaming SSE response from Moonshot API and yield OpenAI-format chunks."""
try:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {self.api_key}"
}
url = f"{self.base_url}/chat/completions"
response = requests.post(url, headers=headers, json=request_body, stream=True, timeout=120)
if response.status_code != 200:
error_msg = response.text
logger.error(f"[MOONSHOT] API error: status={response.status_code}, msg={error_msg}")
yield {"error": True, "message": error_msg, "status_code": response.status_code}
return
current_tool_calls = {}
finish_reason = None
for line in response.iter_lines():
if not line:
continue
line = line.decode("utf-8")
if not line.startswith("data: "):
continue
data_str = line[6:] # Remove "data: " prefix
if data_str.strip() == "[DONE]":
break
try:
chunk = json.loads(data_str)
except json.JSONDecodeError as e:
logger.warning(f"[MOONSHOT] JSON decode error: {e}, data: {data_str[:200]}")
continue
# Check for error in chunk
if chunk.get("error"):
error_data = chunk["error"]
error_msg = error_data.get("message", "Unknown error") if isinstance(error_data, dict) else str(error_data)
logger.error(f"[MOONSHOT] stream error: {error_msg}")
yield {"error": True, "message": error_msg, "status_code": 500}
return
if not chunk.get("choices"):
continue
choice = chunk["choices"][0]
delta = choice.get("delta", {})
# Skip reasoning_content (thinking) don't log or forward
if delta.get("reasoning_content"):
continue
# Handle text content
if "content" in delta and delta["content"]:
yield {
"choices": [{
"index": 0,
"delta": {
"role": "assistant",
"content": delta["content"]
}
}]
}
# Handle tool_calls (streamed incrementally)
if "tool_calls" in delta:
for tool_call_chunk in delta["tool_calls"]:
index = tool_call_chunk.get("index", 0)
if index not in current_tool_calls:
current_tool_calls[index] = {
"id": tool_call_chunk.get("id", ""),
"type": "tool_use",
"name": tool_call_chunk.get("function", {}).get("name", ""),
"input": ""
}
# Accumulate arguments
if "function" in tool_call_chunk and "arguments" in tool_call_chunk["function"]:
current_tool_calls[index]["input"] += tool_call_chunk["function"]["arguments"]
# Yield OpenAI-format tool call delta
yield {
"choices": [{
"index": 0,
"delta": {
"tool_calls": [tool_call_chunk]
}
}]
}
# Capture finish_reason
if choice.get("finish_reason"):
finish_reason = choice["finish_reason"]
# Final chunk with finish_reason
yield {
"choices": [{
"index": 0,
"delta": {},
"finish_reason": finish_reason
}]
}
except requests.exceptions.Timeout:
logger.error("[MOONSHOT] Request timeout")
yield {"error": True, "message": "Request timeout", "status_code": 500}
except Exception as e:
logger.error(f"[MOONSHOT] stream response error: {e}")
import traceback
logger.error(traceback.format_exc())
yield {"error": True, "message": str(e), "status_code": 500}
# -------------------- sync --------------------
def _handle_sync_response(self, request_body: dict):
"""Handle synchronous API response and yield a single result dict."""
try:
headers = {
"Content-Type": "application/json",
"Authorization": f"Bearer {self.api_key}"
}
request_body.pop("stream", None)
url = f"{self.base_url}/chat/completions"
response = requests.post(url, headers=headers, json=request_body, timeout=120)
if response.status_code != 200:
error_msg = response.text
logger.error(f"[MOONSHOT] API error: status={response.status_code}, msg={error_msg}")
yield {"error": True, "message": error_msg, "status_code": response.status_code}
return
result = response.json()
message = result["choices"][0]["message"]
finish_reason = result["choices"][0]["finish_reason"]
response_data = {"role": "assistant", "content": []}
# Add text content
if message.get("content"):
response_data["content"].append({
"type": "text",
"text": message["content"]
})
# Add tool calls
if message.get("tool_calls"):
for tool_call in message["tool_calls"]:
response_data["content"].append({
"type": "tool_use",
"id": tool_call["id"],
"name": tool_call["function"]["name"],
"input": json.loads(tool_call["function"]["arguments"])
})
# Map finish_reason
if finish_reason == "tool_calls":
response_data["stop_reason"] = "tool_use"
elif finish_reason == "stop":
response_data["stop_reason"] = "end_turn"
else:
response_data["stop_reason"] = finish_reason
yield response_data
except requests.exceptions.Timeout:
logger.error("[MOONSHOT] Request timeout")
yield {"error": True, "message": "Request timeout", "status_code": 500}
except Exception as e:
logger.error(f"[MOONSHOT] sync response error: {e}")
import traceback
logger.error(traceback.format_exc())
yield {"error": True, "message": str(e), "status_code": 500}
# -------------------- format conversion --------------------
def _convert_messages_to_openai_format(self, messages):
"""
Convert messages from Claude format to OpenAI format.
Claude format uses content blocks: tool_use / tool_result / text
OpenAI format uses tool_calls in assistant, role=tool for results
"""
if not messages:
return []
converted = []
for msg in messages:
role = msg.get("role")
content = msg.get("content")
# Already a simple string pass through
if isinstance(content, str):
converted.append(msg)
continue
if not isinstance(content, list):
converted.append(msg)
continue
if role == "user":
text_parts = []
tool_results = []
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
tool_call_id = block.get("tool_use_id") or ""
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
# Tool results first (must come right after assistant with tool_calls)
for tr in tool_results:
converted.append(tr)
if text_parts:
converted.append({"role": "user", "content": "\n".join(text_parts)})
elif role == "assistant":
openai_msg = {"role": "assistant"}
text_parts = []
tool_calls = []
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_use":
tool_calls.append({
"id": block.get("id"),
"type": "function",
"function": {
"name": block.get("name"),
"arguments": json.dumps(block.get("input", {}))
}
})
if text_parts:
openai_msg["content"] = "\n".join(text_parts)
elif not tool_calls:
openai_msg["content"] = ""
if tool_calls:
openai_msg["tool_calls"] = tool_calls
if not text_parts:
openai_msg["content"] = None
converted.append(openai_msg)
else:
converted.append(msg)
return converted
def _convert_tools_to_openai_format(self, tools):
"""
Convert tools from Claude format to OpenAI format.
Claude: {name, description, input_schema}
OpenAI: {type: "function", function: {name, description, parameters}}
"""
if not tools:
return None
converted = []
for tool in tools:
# Already in OpenAI format
if "type" in tool and tool["type"] == "function":
converted.append(tool)
else:
converted.append({
"type": "function",
"function": {
"name": tool.get("name"),
"description": tool.get("description"),
"parameters": tool.get("input_schema", {})
}
})
return converted

View File

@@ -310,13 +310,9 @@ class ZHIPUAIBot(Bot, ZhipuAIImage):
if hasattr(delta, 'content') and delta.content:
openai_chunk["choices"][0]["delta"]["content"] = delta.content
# Add reasoning_content if present (GLM-4.7 specific)
# Add reasoning_content as separate field if present (GLM-5/GLM-4.7 thinking)
if hasattr(delta, 'reasoning_content') and delta.reasoning_content:
# Store reasoning in content or metadata
if "content" not in openai_chunk["choices"][0]["delta"]:
openai_chunk["choices"][0]["delta"]["content"] = ""
# Prepend reasoning to content
openai_chunk["choices"][0]["delta"]["content"] = delta.reasoning_content + openai_chunk["choices"][0]["delta"].get("content", "")
openai_chunk["choices"][0]["delta"]["reasoning_content"] = delta.reasoning_content
# Add tool_calls if present
if hasattr(delta, 'tool_calls') and delta.tool_calls:

View File

@@ -1,4 +1,5 @@
openai==0.27.8
aiohttp>=3.8.6,<3.10
HTMLParser>=0.0.2
PyQRCode==1.2.1
qrcode==7.4.2

80
run.sh
View File

@@ -270,24 +270,26 @@ select_model() {
echo -e "${CYAN}${BOLD}=========================================${NC}"
echo -e "${CYAN}${BOLD} Select AI Model${NC}"
echo -e "${CYAN}${BOLD}=========================================${NC}"
echo -e "${YELLOW}1) MiniMax (MiniMax-M2.1, MiniMax-M2.1-lightning, etc.)${NC}"
echo -e "${YELLOW}2) Zhipu AI (glm-4.7, glm-4.6, etc.)${NC}"
echo -e "${YELLOW}3) Qwen (qwen3-max, qwen-plus, qwq-plus, etc.)${NC}"
echo -e "${YELLOW}4) Claude (claude-sonnet-4-5, claude-opus-4-0, etc.)${NC}"
echo -e "${YELLOW}5) Gemini (gemini-3-flash-preview, gemini-2.5-pro, etc.)${NC}"
echo -e "${YELLOW}6) OpenAI GPT (gpt-5.2, gpt-4.1, etc.)${NC}"
echo -e "${YELLOW}7) LinkAI (access multiple models via one API)${NC}"
echo -e "${YELLOW}1) MiniMax (MiniMax-M2.5, MiniMax-M2.1, etc.)${NC}"
echo -e "${YELLOW}2) Zhipu AI (glm-5, glm-4.7, etc.)${NC}"
echo -e "${YELLOW}3) Kimi (kimi-k2.5, kimi-k2, etc.)${NC}"
echo -e "${YELLOW}4) Doubao (doubao-seed-2-0-code-preview-260215, etc.)${NC}"
echo -e "${YELLOW}5) Qwen (qwen3.5-plus, qwen3-max, qwq-plus, etc.)${NC}"
echo -e "${YELLOW}6) Claude (claude-sonnet-4-6, claude-opus-4-6, etc.)${NC}"
echo -e "${YELLOW}7) Gemini (gemini-3.1-pro-preview, gemini-3-flash-preview, etc.)${NC}"
echo -e "${YELLOW}8) OpenAI GPT (gpt-5.2, gpt-4.1, etc.)${NC}"
echo -e "${YELLOW}9) LinkAI (access multiple models via one API)${NC}"
echo ""
while true; do
read -p "Enter your choice [press Enter for default: 1 - MiniMax]: " model_choice
model_choice=${model_choice:-1}
case "$model_choice" in
1|2|3|4|5|6|7)
1|2|3|4|5|6|7|8|9)
break
;;
*)
echo -e "${RED}Invalid choice. Please enter 1-7.${NC}"
echo -e "${RED}Invalid choice. Please enter 1-9.${NC}"
;;
esac
done
@@ -300,8 +302,8 @@ configure_model() {
# MiniMax
echo -e "${GREEN}Configuring MiniMax...${NC}"
read -p "Enter MiniMax API Key: " minimax_key
read -p "Enter model name [press Enter for default: MiniMax-M2.1]: " model_name
model_name=${model_name:-MiniMax-M2.1}
read -p "Enter model name [press Enter for default: MiniMax-M2.5]: " model_name
model_name=${model_name:-MiniMax-M2.5}
MODEL_NAME="$model_name"
MINIMAX_KEY="$minimax_key"
@@ -310,28 +312,48 @@ configure_model() {
# Zhipu AI
echo -e "${GREEN}Configuring Zhipu AI...${NC}"
read -p "Enter Zhipu AI API Key: " zhipu_key
read -p "Enter model name [press Enter for default: glm-4.7]: " model_name
model_name=${model_name:-glm-4.7}
read -p "Enter model name [press Enter for default: glm-5]: " model_name
model_name=${model_name:-glm-5}
MODEL_NAME="$model_name"
ZHIPU_KEY="$zhipu_key"
;;
3)
# Kimi (Moonshot)
echo -e "${GREEN}Configuring Kimi (Moonshot)...${NC}"
read -p "Enter Moonshot API Key: " moonshot_key
read -p "Enter model name [press Enter for default: kimi-k2.5]: " model_name
model_name=${model_name:-kimi-k2.5}
MODEL_NAME="$model_name"
MOONSHOT_KEY="$moonshot_key"
;;
4)
# Doubao (Volcengine Ark)
echo -e "${GREEN}Configuring Doubao (Volcengine Ark)...${NC}"
read -p "Enter Ark API Key: " ark_key
read -p "Enter model name [press Enter for default: doubao-seed-2-0-code-preview-260215]: " model_name
model_name=${model_name:-doubao-seed-2-0-code-preview-260215}
MODEL_NAME="$model_name"
ARK_KEY="$ark_key"
;;
5)
# Qwen (DashScope)
echo -e "${GREEN}Configuring Qwen (DashScope)...${NC}"
read -p "Enter DashScope API Key: " dashscope_key
read -p "Enter model name [press Enter for default: qwen3-max]: " model_name
model_name=${model_name:-qwen3-max}
read -p "Enter model name [press Enter for default: qwen3.5-plus]: " model_name
model_name=${model_name:-qwen3.5-plus}
MODEL_NAME="$model_name"
DASHSCOPE_KEY="$dashscope_key"
;;
4)
6)
# Claude
echo -e "${GREEN}Configuring Claude...${NC}"
read -p "Enter Claude API Key: " claude_key
read -p "Enter model name [press Enter for default: claude-sonnet-4-5]: " model_name
model_name=${model_name:-claude-sonnet-4-5}
read -p "Enter model name [press Enter for default: claude-sonnet-4-6]: " model_name
model_name=${model_name:-claude-sonnet-4-6}
read -p "Enter API Base URL [press Enter for default: https://api.anthropic.com/v1]: " api_base
api_base=${api_base:-https://api.anthropic.com/v1}
@@ -339,12 +361,12 @@ configure_model() {
CLAUDE_KEY="$claude_key"
CLAUDE_BASE="$api_base"
;;
5)
7)
# Gemini
echo -e "${GREEN}Configuring Gemini...${NC}"
read -p "Enter Gemini API Key: " gemini_key
read -p "Enter model name [press Enter for default: gemini-3-flash-preview]: " model_name
model_name=${model_name:-gemini-3-flash-preview}
read -p "Enter model name [press Enter for default: gemini-3.1-pro-preview]: " model_name
model_name=${model_name:-gemini-3.1-pro-preview}
read -p "Enter API Base URL [press Enter for default: https://generativelanguage.googleapis.com]: " api_base
api_base=${api_base:-https://generativelanguage.googleapis.com}
@@ -352,7 +374,7 @@ configure_model() {
GEMINI_KEY="$gemini_key"
GEMINI_BASE="$api_base"
;;
6)
8)
# OpenAI
echo -e "${GREEN}Configuring OpenAI GPT...${NC}"
read -p "Enter OpenAI API Key: " openai_key
@@ -365,12 +387,12 @@ configure_model() {
OPENAI_KEY="$openai_key"
OPENAI_BASE="$api_base"
;;
7)
9)
# LinkAI
echo -e "${GREEN}Configuring LinkAI...${NC}"
read -p "Enter LinkAI API Key: " linkai_key
read -p "Enter model name [press Enter for default: MiniMax-M2.1]: " model_name
model_name=${model_name:-MiniMax-M2.1}
read -p "Enter model name [press Enter for default: MiniMax-M2.5]: " model_name
model_name=${model_name:-MiniMax-M2.5}
MODEL_NAME="$model_name"
USE_LINKAI="true"
@@ -483,6 +505,8 @@ create_config_file() {
"gemini_api_key": "${GEMINI_KEY:-}",
"gemini_api_base": "${GEMINI_BASE:-https://generativelanguage.googleapis.com}",
"zhipu_ai_api_key": "${ZHIPU_KEY:-}",
"moonshot_api_key": "${MOONSHOT_KEY:-}",
"ark_api_key": "${ARK_KEY:-}",
"dashscope_api_key": "${DASHSCOPE_KEY:-}",
"minimax_api_key": "${MINIMAX_KEY:-}",
"voice_to_text": "openai",
@@ -518,6 +542,8 @@ EOF
"gemini_api_key": "${GEMINI_KEY:-}",
"gemini_api_base": "${GEMINI_BASE:-https://generativelanguage.googleapis.com}",
"zhipu_ai_api_key": "${ZHIPU_KEY:-}",
"moonshot_api_key": "${MOONSHOT_KEY:-}",
"ark_api_key": "${ARK_KEY:-}",
"dashscope_api_key": "${DASHSCOPE_KEY:-}",
"minimax_api_key": "${MINIMAX_KEY:-}",
"voice_to_text": "openai",
@@ -552,6 +578,8 @@ EOF
"gemini_api_key": "${GEMINI_KEY:-}",
"gemini_api_base": "${GEMINI_BASE:-https://generativelanguage.googleapis.com}",
"zhipu_ai_api_key": "${ZHIPU_KEY:-}",
"moonshot_api_key": "${MOONSHOT_KEY:-}",
"ark_api_key": "${ARK_KEY:-}",
"dashscope_api_key": "${DASHSCOPE_KEY:-}",
"minimax_api_key": "${MINIMAX_KEY:-}",
"voice_to_text": "openai",
@@ -592,6 +620,8 @@ EOF
"gemini_api_key": "${GEMINI_KEY:-}",
"gemini_api_base": "${GEMINI_BASE:-https://generativelanguage.googleapis.com}",
"zhipu_ai_api_key": "${ZHIPU_KEY:-}",
"moonshot_api_key": "${MOONSHOT_KEY:-}",
"ark_api_key": "${ARK_KEY:-}",
"dashscope_api_key": "${DASHSCOPE_KEY:-}",
"minimax_api_key": "${MINIMAX_KEY:-}",
"voice_to_text": "openai",