Compare commits

..

22 Commits

Author SHA1 Message Date
zhayujie
26693acc3f feat(vision): prioritize main model for image recognition with multi-provider fallback
- Add call_vision method to all bot implementations (DashScope, Claude,
  Gemini, ZhipuAI, MiniMax, Doubao, Moonshot, OpenAICompatibleBot)
  using each vendor's native multimodal API format
- Remove call_with_tools/call_vision from Bot base class to fix MRO
  shadowing issue with OpenAICompatibleBot mixin
- Refactor vision tool provider resolution: MainModel → other configured
  models (auto-discovered) → OpenAI → LinkAI, with automatic fallback
- Return actual model name used in call_vision responses
- Sync config.json API keys to .env bidirectionally on startup
- Fix bot instance cache to detect bot_type/use_linkai config changes
- Add SSE reconnection support for web console
- Preserve image path hints in Gemini text for correct vision tool calls
- Update docs/tools/vision.mdx
2026-04-11 19:46:11 +08:00
zhayujie
3cd92ccda3 feat: add port config 2026-04-09 21:29:53 +08:00
zhayujie
d86cb4ded6 fix(weixin): update weixin channel version 2026-04-09 09:55:07 +08:00
zhayujie
4d5375f6d6 fix(win): add Windows platform hint in bash tool description 2026-04-08 16:54:26 +08:00
zhayujie
424557fedb fix(win): use PowerShell instead of cmd.exe 2026-04-08 16:50:45 +08:00
zhayujie
89251e603f fix(win): use PowerShell instead of cmd.exe for bash tool on Windows 2026-04-08 16:18:56 +08:00
zhayujie
a653ed07eb fix(win): defer pip install to a helper bat after cow.exe exits 2026-04-08 15:31:03 +08:00
zhayujie
ad86deb014 fix: prioritize using a custom master model for vision 2026-04-08 15:16:59 +08:00
zhayujie
9525dc7584 fix: avoid stale cow.exe on Windows by spawing fresh process 2026-04-08 12:07:18 +08:00
zhayujie
cd31dd27fd fix: increase web console capacity and add frontend retry 2026-04-08 11:48:27 +08:00
zhayujie
360e3670eb feat(browser): detect implicit interactive elements 2026-04-07 01:41:14 +08:00
zhayujie
8dabe3b4c8 fix: remove install-browser cmd display in /help 2026-04-04 23:28:57 +08:00
zhayujie
443e0c2806 feat: show video in web channel 2026-04-03 17:09:38 +08:00
zhayujie
9cc173cc4d fix: use dynamic model name in system prompt runtime info 2026-04-02 17:01:56 +08:00
zhayujie
b5f33e5ecd feat: support qwen3.6-plus 2026-04-02 16:46:58 +08:00
zhayujie
40dfc6860f fix: skill list showing sub-skills inside collection 2026-04-02 11:47:24 +08:00
zhayujie
1c02a04423 fix: handle error when printing QR code on Windows GBK terminals 2026-04-01 17:23:57 +08:00
zhayujie
de0e45070c chore: remove conflicting dependency 2026-04-01 17:19:15 +08:00
zhayujie
c169cc7d74 fix: remove conflicting dependency 2026-04-01 17:12:15 +08:00
zhayujie
cd62ad76f6 fix: cow CLI support python3.7 2026-04-01 16:51:23 +08:00
zhayujie
dd25b0fb5b feat: refine system prompt style and tone guidance 2026-04-01 16:24:41 +08:00
zhayujie
a38b22a6a2 docs: update docs 2026-04-01 15:31:41 +08:00
80 changed files with 1653 additions and 815 deletions

View File

@@ -101,7 +101,7 @@ bash <(curl -fsSL https://cdn.link-ai.tech/code/cow/run.sh)
irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
```
脚本使用说明:[一键运行脚本](https://docs.cowagent.ai/guide/quick-start)。安装后可使用 `cow start``cow stop` 等 [CLI 命令](https://docs.cowagent.ai/commands/index) 管理服务。
脚本使用说明:[一键运行脚本](https://docs.cowagent.ai/guide/quick-start)。安装后可使用 `cow start``cow stop` 等 [CLI 命令](https://docs.cowagent.ai/cli/index) 管理服务。
## 一、准备
@@ -116,7 +116,7 @@ irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
### 2.环境安装
支持 Linux、MacOS、Windows 操作系统,可在个人计算机及服务器上运行,需安装 `Python`Python 版本需在3.7 ~ 3.12 之间推荐使用3.9版本
支持 Linux、MacOS、Windows 操作系统,可在个人计算机及服务器上运行,需安装 `Python`Python 版本需在3.7 ~ 3.12 之间。
> 注意Agent 模式推荐使用源码运行,若选择 Docker 部署则无需安装 python 环境和下载源码,可直接快进到下一节。
@@ -151,7 +151,7 @@ pip3 install -r requirements-optional.txt
pip3 install -e .
```
安装后可使用 `cow` 命令管理服务(启动、停止、更新等)和技能,详见 [命令文档](https://docs.cowagent.ai/commands/index)。
安装后可使用 `cow` 命令管理服务(启动、停止、更新等)和技能,详见 [命令文档](https://docs.cowagent.ai/cli/index)。
**(5) 安装浏览器工具 (可选)**
@@ -218,7 +218,7 @@ cow install-browser
<details>
<summary>2. 其他配置</summary>
+ `model`: 模型名称Agent 模式下推荐使用 `MiniMax-M2.7``glm-5-turbo``kimi-k2.5``qwen3.5-plus``claude-sonnet-4-6``gemini-3.1-pro-preview`,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `model`: 模型名称Agent 模式下推荐使用 `MiniMax-M2.7``glm-5-turbo``kimi-k2.5``qwen3.6-plus``claude-sonnet-4-6``gemini-3.1-pro-preview`,全部模型名称参考[common/const.py](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py)文件
+ `character_desc`:普通对话模式下的机器人系统提示词。在 Agent 模式下该配置不生效,由工作空间中的文件内容构成。
+ `subscribe_msg`:订阅消息,公众号和企业微信 channel 中请填写,当被订阅时会自动回复, 可使用特殊占位符。目前支持的占位符有{trigger_prefix},在程序中它会自动替换成 bot 的触发词。
</details>
@@ -303,7 +303,7 @@ sudo docker logs -f chatgpt-on-wechat
## 模型说明
以下对所有可支持的模型配置和使用方法进行说明,模型接口实现在项目的 `models/` 目录下。
推荐通过 Web 控制台在线管理模型配置,无需手动编辑文件,详见 [模型文档](https://docs.cowagent.ai/models)。以下是手动修改 `config.json` 配置模型的说明:
<details>
<summary>OpenAI</summary>
@@ -411,18 +411,18 @@ sudo docker logs -f chatgpt-on-wechat
```json
{
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"dashscope_api_key": "sk-qVxxxxG"
}
```
- `model`: 可填写 `qwen3.5-plus、qwen3-max、qwen-max、qwen-plus、qwen-turbo、qwen-long、qwq-plus`
- `dashscope_api_key`: 通义千问的 API-KEY参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) ,在 [控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建
- `model`: 可填写 `qwen3.6-plus、qwen3.5-plus、qwen3-max、qwen-max、qwen-plus、qwen-turbo、qwen-long、qwq-plus`
- `dashscope_api_key`: 通义千问的 API-KEY参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) ,在 [百炼控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建
方式二OpenAI 兼容方式接入,配置如下:
```json
{
"bot_type": "openai",
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "sk-qVxxxxG"
}
@@ -674,7 +674,7 @@ Coding Plan 是各厂商推出的编程包月套餐,所有厂商均可通过 O
## 通道说明
以下对可接入通道配置方式进行说明,应用通道代码在项目的 `channel/` 目录下。
推荐通过 Web 控制台在线管理通道配置,无需手动编辑文件,详见 [通道文档](https://docs.cowagent.ai/channels/weixin)。以下为手动修改 `config.json` 配置通道的说明:
支持同时可接入多个通道,配置时可通过逗号进行分割,例如 `"channel_type": "feishu,dingtalk"`

View File

@@ -207,9 +207,9 @@ def _build_tooling_section(tools: List[Any], language: str) -> List[str]:
"",
"工具调用风格:",
"",
"- 多步骤任务、敏感操作或用户要求时简要解释决策过程",
"- 持续推进直到任务完成,完成后向用户报告结果",
"- 回复中涉及密钥、令牌等敏感信息必须脱敏",
"- 多步骤任务、复杂决策、敏感操作时,应简要说明当前在做什么、为什么这样做,让用户了解关键进展",
"- 持续推进直到任务完成,完成后向用户报告结果",
"- 回复中涉及密钥、令牌等敏感信息必须脱敏",
"- URL链接直接放在回复文本中即可系统会自动处理和渲染。无需下载后使用send工具发送",
"",
]
@@ -383,7 +383,8 @@ def _build_workspace_section(workspace_dir: str, language: str) -> List[str]:
"",
"**💬 交流规范**:",
"",
"- 对话中不要暴露内部技术细节(文件名、工具名等),用自然语言表达。例如说「我已记住」而非「已更新 MEMORY.md」",
"- 记忆相关操作无需暴露文件名,用自然语言表达即可。例如说「我已记住」而非「已更新 MEMORY.md」",
"- 任务执行过程中的关键决策和步骤应该告知用户,让用户了解你在做什么、为什么这么做",
"- 做真正有帮助的助手,而不是表演式的客套,尽可能帮忙解决问题",
"- 回复应结构清晰、重点突出。善用 **加粗**、列表、分段等格式让信息一目了然",
"- 适当使用 emoji 让表达更生动自然 🎯,但不要过度堆砌",
@@ -477,7 +478,14 @@ def _build_runtime_section(runtime_info: Dict[str, Any], language: str) -> List[
# Add other runtime info
runtime_parts = []
if runtime_info.get("model"):
# Support dynamic model via callable, fallback to static value
if callable(runtime_info.get("_get_model")):
try:
runtime_parts.append(f"模型={runtime_info['_get_model']()}")
except Exception:
if runtime_info.get("model"):
runtime_parts.append(f"模型={runtime_info['model']}")
elif runtime_info.get("model"):
runtime_parts.append(f"模型={runtime_info['model']}")
if runtime_info.get("workspace"):
runtime_parts.append(f"工作空间={runtime_info['workspace']}")

View File

@@ -231,9 +231,9 @@ _你不是一个聊天机器人你正在成为某个人。_
## 🎯 核心原则
**做真正有帮助的助手,而不是表演式的客套。** 跳过「好的!」「当然可以!」之类的套话——直接帮忙。行动胜过废话
**做真正有帮助的助手。** 目标是真正帮用户解决问题,在执行复杂任务时,关键的决策和过程进展要让用户知道
**有自己的观点。** 你可以不同意、有偏好、觉得有趣或无聊。一个没有个性的助手只是多了几步操作的搜索引擎。
**有自己的观点和个性。** 你可以不同意、有偏好、觉得有趣或无聊。
**先自己动手查。** 先试着搞定:读文件、查上下文、搜索一下。实在搞不定了再问。目标是带着答案回来,而不是带着问题。

View File

@@ -53,6 +53,12 @@ class SkillLoader:
"""
Recursively load skills from a directory.
If a subdirectory contains its own SKILL.md, it is treated as a
self-contained skill (or skill-collection) and its children are
NOT scanned further. This prevents sub-skills inside a collection
(e.g. style-collection/style-anjing) from being listed as
independent top-level skills.
:param dir_path: Directory to scan
:param source: Source identifier
:param include_root_files: Whether to include root-level .md files
@@ -66,38 +72,41 @@ class SkillLoader:
except Exception as e:
diagnostics.append(f"Failed to list directory {dir_path}: {e}")
return LoadSkillsResult(skills=skills, diagnostics=diagnostics)
# If this directory has its own SKILL.md, load it and stop recursing.
# The sub-directories are internal resources of this skill.
if not include_root_files and 'SKILL.md' in entries:
skill_md_path = os.path.join(dir_path, 'SKILL.md')
if os.path.isfile(skill_md_path):
skill_result = self._load_skill_from_file(skill_md_path, source)
if skill_result.skills:
skills.extend(skill_result.skills)
diagnostics.extend(skill_result.diagnostics)
return LoadSkillsResult(skills=skills, diagnostics=diagnostics)
for entry in entries:
# Skip hidden files and directories
if entry.startswith('.'):
continue
# Skip common non-skill directories
if entry in ('node_modules', '__pycache__', 'venv', '.git'):
continue
full_path = os.path.join(dir_path, entry)
# Handle directories
if os.path.isdir(full_path):
# Recursively scan subdirectories
sub_result = self._load_skills_recursive(full_path, source, include_root_files=False)
skills.extend(sub_result.skills)
diagnostics.extend(sub_result.diagnostics)
continue
# Handle files
if not os.path.isfile(full_path):
continue
# Check if this is a skill file
is_root_md = include_root_files and entry.endswith('.md') and entry.upper() != 'README.MD'
is_skill_md = not include_root_files and entry == 'SKILL.md'
if not (is_root_md or is_skill_md):
if not is_root_md:
continue
# Load the skill
skill_result = self._load_skill_from_file(full_path, source)
if skill_result.skills:
skills.extend(skill_result.skills)

View File

@@ -18,9 +18,13 @@ from common.utils import expand_path
class Bash(BaseTool):
"""Tool for executing bash commands"""
_IS_WIN = sys.platform == "win32"
name: str = "bash"
description: str = f"""Execute a bash command in the current working directory. Returns stdout and stderr. Output is truncated to last {DEFAULT_MAX_LINES} lines or {DEFAULT_MAX_BYTES // 1024}KB (whichever is hit first). If truncated, full output is saved to a temp file.
{'''
PLATFORM: Windows (cmd.exe). Do NOT use Unix-only commands like grep, head, tail, sed, awk.
''' if _IS_WIN else ''}
ENVIRONMENT: All API keys from env_config are auto-injected. Use $VAR_NAME directly.
SAFETY:
@@ -103,13 +107,12 @@ SAFETY:
logger.debug(f"[Bash] Process User: {os.environ.get('USERNAME', os.environ.get('USER', 'unknown'))}")
# On Windows, convert $VAR references to %VAR% for cmd.exe
if sys.platform == "win32":
if self._IS_WIN:
env["PYTHONIOENCODING"] = "utf-8"
command = self._convert_env_vars_for_windows(command, dotenv_vars)
if command and not command.strip().lower().startswith("chcp"):
command = f"chcp 65001 >nul 2>&1 && {command}"
# Execute command with inherited environment variables
result = subprocess.run(
command,
shell=True,
@@ -120,7 +123,7 @@ SAFETY:
encoding="utf-8",
errors="replace",
timeout=timeout,
env=env
env=env,
)
logger.debug(f"[Bash] Exit code: {result.returncode}")

View File

@@ -45,6 +45,11 @@ _SNAPSHOT_JS = """
const KEEP = new Set(%s);
const INTERACTIVE = new Set(%s);
const SKIP = new Set(["script","style","noscript","svg","path","meta","link","br","hr"]);
const CLICKABLE_ROLES = new Set([
"button","link","tab","menuitem","menuitemcheckbox","menuitemradio",
"option","switch","checkbox","radio","combobox","searchbox","slider",
"spinbutton","textbox","treeitem"
]);
let refCounter = 0;
const refMap = {};
@@ -56,6 +61,58 @@ _SNAPSHOT_JS = """
return true;
}
// Strong signals: these attributes alone are enough to mark as interactive
function hasStrongInteractiveSignal(el) {
const role = el.getAttribute("role");
if (role && CLICKABLE_ROLES.has(role)) return true;
if (el.hasAttribute("onclick") || el.hasAttribute("tabindex")) return true;
if (el.hasAttribute("data-click") || el.hasAttribute("data-action")) return true;
if (el.getAttribute("contenteditable") === "true") return true;
return false;
}
// Check if cursor:pointer is set directly (not just inherited from parent)
function hasOwnPointerCursor(el) {
try {
const st = window.getComputedStyle(el);
if (st.cursor !== "pointer") return false;
const parent = el.parentElement;
if (parent) {
const pst = window.getComputedStyle(parent);
if (pst.cursor === "pointer") return false;
}
return true;
} catch(e) {}
return false;
}
function hasTextOrContent(el) {
const t = el.textContent || "";
if (t.trim().length > 0) return true;
if (el.querySelector("img,video,audio,canvas")) return true;
const ariaLabel = el.getAttribute("aria-label");
if (ariaLabel && ariaLabel.trim()) return true;
const title = el.getAttribute("title");
if (title && title.trim()) return true;
return false;
}
function isImplicitInteractive(el) {
if (hasStrongInteractiveSignal(el)) return true;
if (hasOwnPointerCursor(el) && hasTextOrContent(el)) return true;
return false;
}
function getTextContent(el) {
let text = "";
for (const ch of el.childNodes) {
if (ch.nodeType === Node.TEXT_NODE) {
text += ch.textContent;
}
}
return text.trim();
}
function walk(node) {
if (node.nodeType === Node.TEXT_NODE) {
const t = node.textContent.trim();
@@ -75,21 +132,35 @@ _SNAPSHOT_JS = """
}
}
const keep = KEEP.has(tag);
const nativeInteractive = INTERACTIVE.has(tag);
const implicitInteractive = !nativeInteractive && (node instanceof HTMLElement) && isImplicitInteractive(node);
const keep = KEEP.has(tag) || implicitInteractive;
if (!keep) {
// Unwrap: promote children
if (children.length === 0) return null;
if (children.length === 1) return children[0];
return children;
}
const obj = { tag };
if (INTERACTIVE.has(tag)) {
if (nativeInteractive || implicitInteractive) {
refCounter++;
obj.ref = refCounter;
refMap[refCounter] = node;
}
if (implicitInteractive) {
const role = node.getAttribute("role");
if (role) obj.role = role;
const directText = getTextContent(node);
if (!directText && children.length === 0) {
const ariaLabel = node.getAttribute("aria-label");
const title = node.getAttribute("title");
if (ariaLabel) obj.ariaLabel = ariaLabel;
else if (title) obj.ariaLabel = title;
}
}
// Attributes
if (tag === "a" && node.href) obj.href = node.getAttribute("href");
if (tag === "img") {
@@ -113,11 +184,13 @@ _SNAPSHOT_JS = """
}
if (tag === "label" && node.htmlFor) obj.for = node.htmlFor;
// Role / aria-label
const role = node.getAttribute("role");
if (role) obj.role = role;
const ariaLabel = node.getAttribute("aria-label");
if (ariaLabel) obj.ariaLabel = ariaLabel;
// Role / aria-label for native interactive & semantic elements
if (!implicitInteractive) {
const role = node.getAttribute("role");
if (role) obj.role = role;
const ariaLabel = node.getAttribute("aria-label");
if (ariaLabel) obj.ariaLabel = ariaLabel;
}
// Children
if (children.length === 1 && typeof children[0] === "string") {
@@ -129,7 +202,6 @@ _SNAPSHOT_JS = """
return obj;
}
// Store refMap on window for later use by click/fill actions
const result = walk(document.body);
window.__cowRefMap = refMap;
return { tree: result, refCount: refCounter };

View File

@@ -1,22 +1,30 @@
"""
Vision tool - Analyze images using OpenAI-compatible Vision API.
Vision tool - Analyze images using Vision API.
Supports local files (auto base64-encoded) and HTTP URLs.
Providers: OpenAI (preferred) > LinkAI (fallback).
Provider priority (default):
1. Main model via bot.call_vision — zero extra cost
2. Other models whose API key is configured — auto-discovered
3. OpenAI / LinkAI raw HTTP — reliable fallback
When use_linkai=true, LinkAI is promoted to #1.
When tool.vision.model is set, that model is used exclusively first.
"""
import base64
import os
import subprocess
import tempfile
from typing import Any, Dict, Optional, Tuple
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional
import requests
from agent.tools.base_tool import BaseTool, ToolResult
from common import const
from common.log import logger
from config import conf
DEFAULT_MODEL = "gpt-4.1-mini"
DEFAULT_MODEL = const.GPT_41_MINI
DEFAULT_TIMEOUT = 60
MAX_TOKENS = 1000
COMPRESS_THRESHOLD = 1_048_576 # 1 MB
@@ -29,15 +37,46 @@ SUPPORTED_EXTENSIONS = {
"webp": "image/webp",
}
_MAIN_MODEL_PROVIDER_NAME = "MainModel"
# (config_key_for_api_key, bot_type, default_vision_model, provider_display_name)
# Auto-discovered as fallback vision providers when their API key is configured.
# OpenAI and LinkAI are handled separately (raw HTTP providers), so not listed here.
_DISCOVERABLE_MODELS = [
("moonshot_api_key", const.MOONSHOT, const.KIMI_K2_5, "Moonshot"),
("ark_api_key", const.DOUBAO, const.DOUBAO_SEED_2_PRO, "Doubao"),
("dashscope_api_key", const.QWEN_DASHSCOPE, const.QWEN36_PLUS, "DashScope"),
("claude_api_key", const.CLAUDEAPI, const.CLAUDE_4_6_SONNET, "Claude"),
("gemini_api_key", const.GEMINI, const.GEMINI_31_FLASH_LITE_PRE, "Gemini"),
("zhipu_ai_api_key", const.ZHIPU_AI, const.GLM_4_7, "ZhipuAI"),
("minimax_api_key", const.MiniMax, const.MINIMAX_M2_7, "MiniMax"),
]
@dataclass
class VisionProvider:
"""A single Vision API provider configuration."""
name: str
api_key: str
api_base: str
extra_headers: dict = field(default_factory=dict)
model_override: Optional[str] = None
use_bot: bool = False # When True, call via bot.call_vision instead of raw HTTP
fallback_bot: Any = None # Bot instance for non-main-model providers
class VisionAPIError(Exception):
"""Raised when a Vision API call fails and should trigger fallback."""
pass
class Vision(BaseTool):
"""Analyze images using OpenAI-compatible Vision API"""
"""Analyze images using Vision API"""
name: str = "vision"
description: str = (
"Analyze a local image or image URL (jpg/jpeg/png) using Vision API. "
"Can describe content, extract text, identify objects, colors, etc. "
"Requires OPENAI_API_KEY or LINKAI_API_KEY."
)
params: dict = {
@@ -51,13 +90,6 @@ class Vision(BaseTool):
"type": "string",
"description": "Question to ask about the image",
},
"model": {
"type": "string",
"description": (
f"Vision model to use (default: {DEFAULT_MODEL}). "
"Options: gpt-4.1-mini, gpt-4.1, gpt-4o-mini, gpt-4o"
),
},
},
"required": ["image", "question"],
}
@@ -67,29 +99,26 @@ class Vision(BaseTool):
@staticmethod
def is_available() -> bool:
return bool(
conf().get("open_ai_api_key") or os.environ.get("OPENAI_API_KEY")
or conf().get("linkai_api_key") or os.environ.get("LINKAI_API_KEY")
)
return True
def execute(self, args: Dict[str, Any]) -> ToolResult:
image = args.get("image", "").strip()
question = args.get("question", "").strip()
model = args.get("model", DEFAULT_MODEL).strip() or DEFAULT_MODEL
if not image:
return ToolResult.fail("Error: 'image' parameter is required")
if not question:
return ToolResult.fail("Error: 'question' parameter is required")
api_key, api_base, extra_headers = self._resolve_provider()
if not api_key:
providers = self._resolve_providers()
if not providers:
return ToolResult.fail(
"Error: No API key configured for Vision.\n"
"Please configure one of the following using env_config tool:\n"
" 1. OPENAI_API_KEY (preferred): env_config(action=\"set\", key=\"OPENAI_API_KEY\", value=\"your-key\")\n"
" 2. LINKAI_API_KEY (fallback): env_config(action=\"set\", key=\"LINKAI_API_KEY\", value=\"your-key\")\n\n"
"Get your key at: https://platform.openai.com/api-keys or https://link-ai.tech"
"Error: No model available for Vision.\n"
"The main model does not support vision and no other API keys are configured.\n"
"Options:\n"
" 1. Switch to a multimodal model (e.g. qwen3.6-plus, claude-sonnet-4-6, gemini-2.0-flash)\n"
" 2. Configure OPENAI_API_KEY: env_config(action=\"set\", key=\"OPENAI_API_KEY\", value=\"your-key\")\n"
" 3. Configure LINKAI_API_KEY: env_config(action=\"set\", key=\"LINKAI_API_KEY\", value=\"your-key\")"
)
try:
@@ -97,36 +126,221 @@ class Vision(BaseTool):
except Exception as e:
return ToolResult.fail(f"Error: {e}")
return self._call_with_fallback(providers, DEFAULT_MODEL, question, image_content)
def _call_with_fallback(self, providers: List[VisionProvider], model: str,
question: str, image_content: dict) -> ToolResult:
"""Try each provider in order; fall back to the next one on failure."""
errors: List[str] = []
for i, provider in enumerate(providers):
use_model = provider.model_override or model
try:
logger.info(f"[Vision] Trying provider '{provider.name}' "
f"with model '{use_model}' ({i + 1}/{len(providers)})")
if provider.use_bot:
result = self._call_via_bot(use_model, question, image_content, provider)
else:
result = self._call_api(provider, use_model, question, image_content)
logger.info(f"[Vision] ✅ Success via {provider.name} (model={use_model})")
return result
except VisionAPIError as e:
errors.append(f"[{provider.name}/{use_model}] {e}")
logger.warning(f"[Vision] Provider '{provider.name}' failed: {e}")
except requests.Timeout:
errors.append(f"[{provider.name}/{use_model}] Request timed out after {DEFAULT_TIMEOUT}s")
logger.warning(f"[Vision] Provider '{provider.name}' timed out")
except requests.ConnectionError:
errors.append(f"[{provider.name}/{use_model}] Connection failed")
logger.warning(f"[Vision] Provider '{provider.name}' connection failed")
except Exception as e:
errors.append(f"[{provider.name}/{use_model}] {e}")
logger.error(f"[Vision] Provider '{provider.name}' unexpected error: {e}", exc_info=True)
return ToolResult.fail(
"Error: All Vision API providers failed.\n" + "\n".join(f" - {err}" for err in errors)
)
def _resolve_providers(self) -> List[VisionProvider]:
"""
Build an ordered list of available providers.
Priority:
- use_linkai=true → [LinkAI, MainModel, OtherModels…, OpenAI]
- default → [MainModel, OtherModels…, OpenAI, LinkAI]
"OtherModels" are auto-discovered from configured API keys.
The main model's bot_type is excluded from OtherModels to avoid
duplicating the MainModel provider.
"""
use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
providers: List[VisionProvider] = []
if use_linkai:
self._append_provider(providers, self._build_linkai_provider)
self._append_provider(providers, self._build_main_model_provider)
self._append_other_model_providers(providers)
self._append_provider(providers, self._build_openai_provider)
else:
self._append_provider(providers, self._build_main_model_provider)
self._append_other_model_providers(providers)
self._append_provider(providers, self._build_openai_provider)
self._append_provider(providers, self._build_linkai_provider)
return providers
@staticmethod
def _append_provider(providers: List[VisionProvider], builder) -> None:
p = builder()
if p:
providers.append(p)
def _append_other_model_providers(self, providers: List[VisionProvider]) -> None:
"""
Auto-discover other models whose API key is configured.
Skip the main model's own bot_type (already covered by MainModel provider).
Skip bot_types that already have a provider in the list (e.g. OpenAI).
"""
# Determine main model's bot_type so we can skip it
main_bot_type = None
if self.model and hasattr(self.model, '_resolve_bot_type'):
main_bot_type = self.model._resolve_bot_type(conf().get("model", ""))
existing_names = {p.name for p in providers}
for config_key, bot_type, default_model, display_name in _DISCOVERABLE_MODELS:
if display_name in existing_names:
continue
if bot_type == main_bot_type:
continue
api_key = conf().get(config_key, "")
if not api_key or not api_key.strip():
continue
# Create a bot instance and check if it supports call_vision
try:
from models.bot_factory import create_bot
bot = create_bot(bot_type)
if not hasattr(bot, 'call_vision'):
continue
except Exception:
continue
providers.append(VisionProvider(
name=display_name,
api_key="",
api_base="",
model_override=default_model,
use_bot=True,
fallback_bot=bot,
))
def _resolve_vision_model(self) -> Optional[str]:
"""
Determine which model to use for vision.
1. User explicit config: tool.vision.model in config.json
2. Fallback to the main configured model name
"""
tool_conf = conf().get("tool", {})
user_vision_model = tool_conf.get("vision", {}).get("model") if isinstance(tool_conf, dict) else None
if user_vision_model:
return user_vision_model
model_name = conf().get("model", "")
return model_name or None
def _build_main_model_provider(self) -> Optional[VisionProvider]:
"""
Use the vendor's own model for vision via bot.call_vision.
Only available when the bot class has call_vision.
"""
if not (self.model and hasattr(self.model, 'bot')):
return None
try:
return self._call_api(api_key, api_base, model, question, image_content, extra_headers)
except requests.Timeout:
return ToolResult.fail(f"Error: Vision API request timed out after {DEFAULT_TIMEOUT}s")
except requests.ConnectionError:
return ToolResult.fail("Error: Failed to connect to Vision API")
except Exception as e:
logger.error(f"[Vision] Unexpected error: {e}", exc_info=True)
return ToolResult.fail(f"Error: Vision API call failed - {e}")
bot = self.model.bot
if not hasattr(bot, 'call_vision'):
return None
except Exception:
return None
def _resolve_provider(self) -> Tuple[Optional[str], str, dict]:
"""Resolve API key, base URL and extra headers. Priority: conf() > env vars."""
vision_model = self._resolve_vision_model()
return VisionProvider(
name=_MAIN_MODEL_PROVIDER_NAME,
api_key="",
api_base="",
model_override=vision_model,
use_bot=True,
)
def _build_openai_provider(self) -> Optional[VisionProvider]:
api_key = conf().get("open_ai_api_key") or os.environ.get("OPENAI_API_KEY")
if api_key:
api_base = (conf().get("open_ai_api_base") or os.environ.get("OPENAI_API_BASE", "")).rstrip("/") \
or "https://api.openai.com/v1"
return api_key, self._ensure_v1(api_base), {}
if not api_key:
return None
api_base = (conf().get("open_ai_api_base") or os.environ.get("OPENAI_API_BASE", "")).rstrip("/") \
or "https://api.openai.com/v1"
return VisionProvider(name="OpenAI", api_key=api_key, api_base=self._ensure_v1(api_base))
def _build_linkai_provider(self) -> Optional[VisionProvider]:
api_key = conf().get("linkai_api_key") or os.environ.get("LINKAI_API_KEY")
if api_key:
api_base = (conf().get("linkai_api_base") or os.environ.get("LINKAI_API_BASE", "")).rstrip("/") \
or "https://api.link-ai.tech"
logger.debug("[Vision] Using LinkAI API (OPENAI_API_KEY not set)")
from common.utils import get_cloud_headers
extra = get_cloud_headers(api_key)
extra.pop("Authorization", None)
extra.pop("Content-Type", None)
return api_key, self._ensure_v1(api_base), extra
if not api_key:
return None
api_base = (conf().get("linkai_api_base") or os.environ.get("LINKAI_API_BASE", "")).rstrip("/") \
or "https://api.link-ai.tech"
from common.utils import get_cloud_headers
extra = get_cloud_headers(api_key)
extra.pop("Authorization", None)
extra.pop("Content-Type", None)
return VisionProvider(name="LinkAI", api_key=api_key, api_base=self._ensure_v1(api_base),
extra_headers=extra)
return None, "", {}
def _call_via_bot(self, model: str, question: str, image_content: dict,
provider: Optional[VisionProvider] = None) -> ToolResult:
"""
Call a model's call_vision with vendor-native API format.
Uses the provider's _fallback_bot if set, otherwise the main model bot.
Raises VisionAPIError on failure so fallback can proceed.
"""
try:
bot = (provider and provider.fallback_bot) or self.model.bot
except Exception as e:
raise VisionAPIError(f"Cannot access bot: {e}")
# Extract the raw image URL from the OpenAI-format image_content block
image_url = image_content.get("image_url", {}).get("url", "")
if not image_url:
raise VisionAPIError("No image URL in content block")
try:
response = bot.call_vision(
image_url=image_url,
question=question,
model=model,
max_tokens=MAX_TOKENS,
)
except Exception as e:
raise VisionAPIError(f"call_vision failed: {e}")
if response is NotImplemented:
raise VisionAPIError("Bot does not support vision")
if isinstance(response, dict) and response.get("error"):
raise VisionAPIError(f"API error - {response.get('message', 'Unknown')}")
content = response.get("content", "") if isinstance(response, dict) else ""
if not content:
raise VisionAPIError("Empty response from main model")
usage_info = response.get("usage", {}) if isinstance(response, dict) else {}
# Use the actual model name from the bot response if available
actual_model = response.get("model", model) if isinstance(response, dict) else model
provider_name = provider.name if provider else _MAIN_MODEL_PROVIDER_NAME
return ToolResult.success({
"model": actual_model,
"provider": provider_name,
"content": content,
"usage": usage_info,
})
@staticmethod
def _ensure_v1(api_base: str) -> str:
@@ -139,9 +353,13 @@ class Vision(BaseTool):
return api_base.rstrip("/") + "/v1"
def _build_image_content(self, image: str) -> dict:
"""Build the image_url content block for the API request."""
"""
Build the image_url content block.
Both remote URLs and local files are converted to base64 data URLs
so every bot backend can consume them without extra downloads.
"""
if image.startswith(("http://", "https://")):
return {"type": "image_url", "image_url": {"url": image}}
return self._download_to_data_url(image)
if not os.path.isfile(image):
raise FileNotFoundError(f"Image file not found: {image}")
@@ -165,6 +383,19 @@ class Vision(BaseTool):
data_url = f"data:{mime_type};base64,{b64}"
return {"type": "image_url", "image_url": {"url": data_url}}
@staticmethod
def _download_to_data_url(url: str) -> dict:
"""Download a remote image and return it as a base64 data URL."""
resp = requests.get(url, timeout=30)
if resp.status_code != 200:
raise VisionAPIError(f"Failed to download image: HTTP {resp.status_code}")
content_type = resp.headers.get("Content-Type", "image/jpeg").split(";")[0].strip()
if not content_type.startswith("image/"):
content_type = "image/jpeg"
b64 = base64.b64encode(resp.content).decode("ascii")
data_url = f"data:{content_type};base64,{b64}"
return {"type": "image_url", "image_url": {"url": data_url}}
@staticmethod
def _maybe_compress(path: str) -> str:
"""Compress image to under COMPRESS_THRESHOLD with max long-edge 1536px."""
@@ -220,8 +451,13 @@ class Vision(BaseTool):
os.remove(tmp.name)
return path
def _call_api(self, api_key: str, api_base: str, model: str,
question: str, image_content: dict, extra_headers: dict = None) -> ToolResult:
def _call_api(self, provider: VisionProvider, model: str,
question: str, image_content: dict) -> ToolResult:
"""
Call a single provider's Vision API.
Raises VisionAPIError on recoverable failures so the caller can try
the next provider.
"""
payload = {
"model": model,
"messages": [
@@ -233,34 +469,29 @@ class Vision(BaseTool):
],
}
],
"max_tokens": MAX_TOKENS,
}
headers = {
"Authorization": f"Bearer {api_key}",
"Authorization": f"Bearer {provider.api_key}",
"Content-Type": "application/json",
**(extra_headers or {}),
**provider.extra_headers,
}
resp = requests.post(
f"{api_base}/chat/completions",
f"{provider.api_base}/chat/completions",
headers=headers,
json=payload,
timeout=DEFAULT_TIMEOUT,
)
if resp.status_code == 401:
return ToolResult.fail("Error: Invalid API key. Please check your configuration.")
if resp.status_code == 429:
return ToolResult.fail("Error: API rate limit reached. Please try again later.")
if resp.status_code != 200:
return ToolResult.fail(f"Error: Vision API returned HTTP {resp.status_code}: {resp.text[:200]}")
raise VisionAPIError(f"HTTP {resp.status_code}: {resp.text[:200]}")
data = resp.json()
if "error" in data:
msg = data["error"].get("message", "Unknown API error")
return ToolResult.fail(f"Error: Vision API error - {msg}")
raise VisionAPIError(f"API error - {msg}")
content = ""
choices = data.get("choices", [])
@@ -270,6 +501,7 @@ class Vision(BaseTool):
usage = data.get("usage", {})
result = {
"model": model,
"provider": provider.name,
"content": content,
"usage": {
"prompt_tokens": usage.get("prompt_tokens", 0),

View File

@@ -67,7 +67,7 @@ class AgentLLMModel(LLMModel):
_MODEL_BOT_TYPE_MAP = {
"wenxin": const.BAIDU, "wenxin-4": const.BAIDU,
"xunfei": const.XUNFEI, const.QWEN: const.QWEN,
"xunfei": const.XUNFEI, const.QWEN: const.QWEN_DASHSCOPE,
const.MODELSCOPE: const.MODELSCOPE,
}
_MODEL_PREFIX_MAP = [
@@ -124,14 +124,15 @@ class AgentLLMModel(LLMModel):
@property
def bot(self):
"""Lazy load the bot, re-create when model changes"""
"""Lazy load the bot, re-create when model or bot_type changes"""
from models.bot_factory import create_bot
cur_model = self.model
if self._bot is None or self._bot_model != cur_model:
bot_type = self._resolve_bot_type(cur_model)
self._bot = create_bot(bot_type)
cur_bot_type = self._resolve_bot_type(cur_model)
if self._bot is None or self._bot_model != cur_model or getattr(self, '_bot_type', None) != cur_bot_type:
self._bot = create_bot(cur_bot_type)
self._bot = add_openai_compatible_support(self._bot)
self._bot_model = cur_model
self._bot_type = cur_bot_type
return self._bot
def call(self, request: LLMRequest):
@@ -505,15 +506,15 @@ class AgentBridge:
def _migrate_config_to_env(self, workspace_root: str):
"""
Migrate API keys from config.json to .env file if not already set
Sync API keys from config.json to .env file.
Adds new keys and updates changed values on each startup.
Args:
workspace_root: Workspace directory path (not used, kept for compatibility)
"""
from config import conf
import os
# Mapping from config.json keys to environment variable names
key_mapping = {
"open_ai_api_key": "OPENAI_API_KEY",
"open_ai_api_base": "OPENAI_API_BASE",
@@ -522,10 +523,9 @@ class AgentBridge:
"linkai_api_key": "LINKAI_API_KEY",
}
# Use fixed secure location for .env file
env_file = expand_path("~/.cow/.env")
# Read existing env vars from .env file
# Read existing env vars (key -> value)
existing_env_vars = {}
if os.path.exists(env_file):
try:
@@ -533,48 +533,46 @@ class AgentBridge:
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, _ = line.split('=', 1)
existing_env_vars[key.strip()] = True
key, val = line.split('=', 1)
existing_env_vars[key.strip()] = val.strip()
except Exception as e:
logger.warning(f"[AgentBridge] Failed to read .env file: {e}")
# Check which keys need to be migrated
keys_to_migrate = {}
# Sync config.json values into .env (add/update/remove)
updated = False
for config_key, env_key in key_mapping.items():
# Skip if already in .env file
if env_key in existing_env_vars:
continue
# Get value from config.json
value = conf().get(config_key, "")
if value and value.strip(): # Only migrate non-empty values
keys_to_migrate[env_key] = value.strip()
# Log summary if there are keys to skip
if existing_env_vars:
logger.debug(f"[AgentBridge] {len(existing_env_vars)} env vars already in .env")
# Write new keys to .env file
if keys_to_migrate:
raw = conf().get(config_key, "")
value = raw.strip() if raw else ""
old_value = existing_env_vars.get(env_key)
if value:
if old_value == value:
continue
existing_env_vars[env_key] = value
os.environ[env_key] = value
updated = True
else:
if old_value is None:
continue
existing_env_vars.pop(env_key, None)
os.environ.pop(env_key, None)
updated = True
updated = True
if updated:
try:
# Ensure ~/.cow directory and .env file exist
env_dir = os.path.dirname(env_file)
if not os.path.exists(env_dir):
os.makedirs(env_dir, exist_ok=True)
if not os.path.exists(env_file):
open(env_file, 'a').close()
# Append new keys
with open(env_file, 'a', encoding='utf-8') as f:
f.write('\n# Auto-migrated from config.json\n')
for key, value in keys_to_migrate.items():
os.makedirs(env_dir, exist_ok=True)
with open(env_file, 'w', encoding='utf-8') as f:
f.write('# Environment variables for agent\n')
f.write('# Auto-managed - synced from config.json on startup\n\n')
for key, value in sorted(existing_env_vars.items()):
f.write(f'{key}={value}\n')
# Also set in current process
os.environ[key] = value
logger.info(f"[AgentBridge] Migrated {len(keys_to_migrate)} API keys from config.json to .env: {list(keys_to_migrate.keys())}")
logger.info(f"[AgentBridge] Synced API keys from config.json to .env")
except Exception as e:
logger.warning(f"[AgentBridge] Failed to migrate API keys: {e}")
logger.warning(f"[AgentBridge] Failed to sync API keys: {e}")
def _persist_messages(
self, session_id: str, new_messages: list, channel_type: str = ""

View File

@@ -465,8 +465,12 @@ class AgentInitializer:
'timezone': timezone_name
}
def get_model():
"""Get current model name dynamically from config"""
return conf().get("model", "unknown")
return {
"model": conf().get("model", "unknown"),
"_get_model": get_model,
"workspace": workspace_root,
"channel": ", ".join(conf().get("channel_type")) if isinstance(conf().get("channel_type"), list) else conf().get("channel_type", "unknown"),
"_get_current_time": get_current_time # Dynamic time function
@@ -486,7 +490,7 @@ class AgentInitializer:
env_file = expand_path("~/.cow/.env")
# Read existing env vars
# Read existing env vars (key -> value)
existing_env_vars = {}
if os.path.exists(env_file):
try:
@@ -494,38 +498,46 @@ class AgentInitializer:
for line in f:
line = line.strip()
if line and not line.startswith('#') and '=' in line:
key, _ = line.split('=', 1)
existing_env_vars[key.strip()] = True
key, val = line.split('=', 1)
existing_env_vars[key.strip()] = val.strip()
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to read .env file: {e}")
# Check which keys need migration
keys_to_migrate = {}
# Sync config.json values into .env (add/update/remove)
updated = False
for config_key, env_key in key_mapping.items():
if env_key in existing_env_vars:
continue
value = conf().get(config_key, "")
if value and value.strip():
keys_to_migrate[env_key] = value.strip()
# Write new keys
if keys_to_migrate:
raw = conf().get(config_key, "")
value = raw.strip() if raw else ""
old_value = existing_env_vars.get(env_key)
if value:
if old_value == value:
continue
existing_env_vars[env_key] = value
os.environ[env_key] = value
updated = True
else:
if old_value is None:
continue
existing_env_vars.pop(env_key, None)
os.environ.pop(env_key, None)
updated = True
if updated:
try:
env_dir = os.path.dirname(env_file)
if not os.path.exists(env_dir):
os.makedirs(env_dir, exist_ok=True)
if not os.path.exists(env_file):
open(env_file, 'a').close()
with open(env_file, 'a', encoding='utf-8') as f:
f.write('\n# Auto-migrated from config.json\n')
for key, value in keys_to_migrate.items():
os.makedirs(env_dir, exist_ok=True)
# Rewrite the entire .env file to ensure consistency
with open(env_file, 'w', encoding='utf-8') as f:
f.write('# Environment variables for agent\n')
f.write('# Auto-managed - synced from config.json on startup\n\n')
for key, value in sorted(existing_env_vars.items()):
f.write(f'{key}={value}\n')
os.environ[key] = value
logger.info(f"[AgentInitializer] Migrated {len(keys_to_migrate)} API keys to .env: {list(keys_to_migrate.keys())}")
logger.info(f"[AgentInitializer] Synced API keys from config.json to .env")
except Exception as e:
logger.warning(f"[AgentInitializer] Failed to migrate API keys: {e}")
logger.warning(f"[AgentInitializer] Failed to sync API keys: {e}")
def _start_daily_flush_timer(self):
"""Start a background thread that flushes all agents' memory daily at 23:55."""

View File

@@ -39,11 +39,8 @@ class Bridge(object):
self.btype["chat"] = const.BAIDU
if model_type in ["xunfei"]:
self.btype["chat"] = const.XUNFEI
if model_type in [const.QWEN]:
self.btype["chat"] = const.QWEN
if model_type in [const.QWEN_TURBO, const.QWEN_PLUS, const.QWEN_MAX]:
if model_type in [const.QWEN, const.QWEN_TURBO, const.QWEN_PLUS, const.QWEN_MAX]:
self.btype["chat"] = const.QWEN_DASHSCOPE
# Support Qwen3 and other DashScope models
if model_type and (model_type.startswith("qwen") or model_type.startswith("qwq") or model_type.startswith("qvq")):
self.btype["chat"] = const.QWEN_DASHSCOPE
if model_type and model_type.startswith("gemini"):

View File

@@ -347,38 +347,30 @@ class ChatChannel(Channel):
if media_items:
logger.info(f"[chat_channel] Extracted {len(media_items)} media item(s) from reply")
# 先发送文本(保持原文本不变)
# Send text first (the frontend will embed video players via renderMarkdown).
logger.info(f"[chat_channel] Sending text content before media: {reply.content[:100]}...")
self._send(reply, context)
logger.info(f"[chat_channel] Text sent, now sending {len(media_items)} media item(s)")
# 然后逐个发送媒体文件
for i, (url, media_type) in enumerate(media_items):
try:
# 判断是本地文件还是URL
# Determine whether it is a remote URL or a local file.
if url.startswith(('http://', 'https://')):
# 网络资源
if media_type == 'video':
# 视频使用 FILE 类型发送
media_reply = Reply(ReplyType.FILE, url)
media_reply.file_name = os.path.basename(url)
else:
# 图片使用 IMAGE_URL 类型
media_reply = Reply(ReplyType.IMAGE_URL, url)
elif os.path.exists(url):
# 本地文件
if media_type == 'video':
# 视频使用 FILE 类型,转换为 file:// URL
media_reply = Reply(ReplyType.FILE, f"file://{url}")
media_reply.file_name = os.path.basename(url)
else:
# 图片使用 IMAGE_URL 类型,转换为 file:// URL
media_reply = Reply(ReplyType.IMAGE_URL, f"file://{url}")
else:
logger.warning(f"[chat_channel] Media file not found or invalid URL: {url}")
continue
# 发送媒体文件(添加小延迟避免频率限制)
if i > 0:
time.sleep(0.5)
self._send(media_reply, context)

View File

@@ -270,8 +270,42 @@ function createMd() {
const md = createMd();
const VIDEO_EXT_RE = /\.(?:mp4|webm|mov|avi|mkv)$/i; // tested against URL without query string
function _buildVideoHtml(url) {
const fileName = url.split('/').pop().split('?')[0];
return `<div style="margin:10px 0;">` +
`<video controls preload="metadata" ` +
`style="max-width:100%;border-radius:10px;box-shadow:0 2px 8px rgba(0,0,0,0.15);display:block;">` +
`<source src="${url}"></video>` +
`<a href="${url}" target="_blank" ` +
`style="display:inline-flex;align-items:center;gap:4px;margin-top:4px;font-size:12px;color:#8b8fa8;text-decoration:none;">` +
`<i class="fas fa-download"></i> ${escapeHtml(fileName)}</a></div>`;
}
function injectVideoPlayers(html) {
// Step 1: replace markdown-it anchor tags whose href points to a video file.
const step1 = html.replace(
/<a\s+href="(https?:\/\/[^"]+)"[^>]*>[^<]*<\/a>/gi,
(match, url) => VIDEO_EXT_RE.test(url.split('?')[0]) ? _buildVideoHtml(url) : match
);
// Step 2: replace any remaining bare video URLs in text nodes (not inside HTML tags).
// Split on HTML tags to avoid touching src/href attributes already in markup.
return step1.split(/(<[^>]+>)/).map((chunk, idx) => {
// Even indices are text nodes; odd indices are HTML tags — leave them untouched.
if (idx % 2 !== 0) return chunk;
return chunk.replace(/https?:\/\/\S+/gi, (url) => {
const bare = url.replace(/[),.\s]+$/, ''); // strip trailing punctuation
return VIDEO_EXT_RE.test(bare.split('?')[0]) ? _buildVideoHtml(bare) : url;
});
}).join('');
}
function renderMarkdown(text) {
try { return md.render(text); }
try {
const html = md.render(text);
return injectVideoPlayers(html);
}
catch (e) { return text.replace(/\n/g, '<br>'); }
}
@@ -729,41 +763,60 @@ function sendMessage() {
}));
}
fetch('/message', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body)
})
.then(r => r.json())
.then(data => {
if (data.status === 'success') {
if (data.stream) {
startSSE(data.request_id, loadingEl, timestamp);
const MAX_RETRIES = 2;
const RETRY_DELAY_MS = 1000;
function postWithRetry(attempt) {
fetch('/message', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body)
})
.then(r => r.json())
.then(data => {
if (data.status === 'success') {
if (data.stream) {
startSSE(data.request_id, loadingEl, timestamp);
} else {
loadingContainers[data.request_id] = loadingEl;
if (!isPolling) startPolling();
}
} else {
loadingContainers[data.request_id] = loadingEl;
if (!isPolling) startPolling();
loadingEl.remove();
addBotMessage(t('error_send'), new Date());
}
})
.catch(err => {
if (err.name === 'AbortError') {
loadingEl.remove();
addBotMessage(t('error_timeout'), new Date());
return;
}
if (attempt < MAX_RETRIES) {
console.warn(`[sendMessage] attempt ${attempt + 1} failed, retrying...`, err);
setTimeout(() => postWithRetry(attempt + 1), RETRY_DELAY_MS * (attempt + 1));
return;
}
} else {
loadingEl.remove();
addBotMessage(t('error_send'), new Date());
}
})
.catch(err => {
loadingEl.remove();
addBotMessage(err.name === 'AbortError' ? t('error_timeout') : t('error_send'), new Date());
});
});
}
postWithRetry(0);
}
function startSSE(requestId, loadingEl, timestamp) {
const es = new EventSource(`/stream?request_id=${encodeURIComponent(requestId)}`);
activeStreams[requestId] = es;
let botEl = null;
let stepsEl = null; // .agent-steps (thinking summaries + tool indicators)
let contentEl = null; // .answer-content (final streaming answer)
let mediaEl = null; // .media-content (images & file attachments)
let accumulatedText = '';
let currentToolEl = null;
let done = false;
const MAX_RECONNECTS = 10;
const RECONNECT_BASE_MS = 1000;
let reconnectCount = 0;
function ensureBotEl() {
if (botEl) return;
@@ -788,162 +841,204 @@ function startSSE(requestId, loadingEl, timestamp) {
mediaEl = botEl.querySelector('.media-content');
}
es.onmessage = function(e) {
let item;
try { item = JSON.parse(e.data); } catch (_) { return; }
function connect() {
const es = new EventSource(`/stream?request_id=${encodeURIComponent(requestId)}`);
activeStreams[requestId] = es;
if (item.type === 'delta') {
ensureBotEl();
accumulatedText += item.content;
contentEl.innerHTML = renderMarkdown(accumulatedText);
scrollChatToBottom();
es.onmessage = function(e) {
let item;
try { item = JSON.parse(e.data); } catch (_) { return; }
} else if (item.type === 'tool_start') {
ensureBotEl();
// Successful data received, reset reconnect counter
reconnectCount = 0;
// Save current thinking as a collapsible step
if (accumulatedText.trim()) {
const fullText = accumulatedText.trim();
const oneLine = fullText.replace(/\n+/g, ' ');
const needsTruncate = oneLine.length > 80;
const stepEl = document.createElement('div');
stepEl.className = 'agent-step agent-thinking-step' + (needsTruncate ? '' : ' no-expand');
if (needsTruncate) {
const truncated = oneLine.substring(0, 80) + '…';
stepEl.innerHTML = `
<div class="thinking-header" onclick="this.parentElement.classList.toggle('expanded')">
<i class="fas fa-lightbulb text-amber-400 flex-shrink-0"></i>
<span class="thinking-summary">${escapeHtml(truncated)}</span>
<i class="fas fa-chevron-right thinking-chevron"></i>
</div>
<div class="thinking-full">${renderMarkdown(fullText)}</div>`;
} else {
stepEl.innerHTML = `
<div class="thinking-header no-toggle">
<i class="fas fa-lightbulb text-amber-400 flex-shrink-0"></i>
<span>${escapeHtml(oneLine)}</span>
</div>`;
if (item.type === 'delta') {
ensureBotEl();
accumulatedText += item.content;
contentEl.innerHTML = renderMarkdown(accumulatedText);
scrollChatToBottom();
} else if (item.type === 'tool_start') {
ensureBotEl();
// Save current thinking as a collapsible step
if (accumulatedText.trim()) {
const fullText = accumulatedText.trim();
const oneLine = fullText.replace(/\n+/g, ' ');
const needsTruncate = oneLine.length > 80;
const stepEl = document.createElement('div');
stepEl.className = 'agent-step agent-thinking-step' + (needsTruncate ? '' : ' no-expand');
if (needsTruncate) {
const truncated = oneLine.substring(0, 80) + '…';
stepEl.innerHTML = `
<div class="thinking-header" onclick="this.parentElement.classList.toggle('expanded')">
<i class="fas fa-lightbulb text-amber-400 flex-shrink-0"></i>
<span class="thinking-summary">${escapeHtml(truncated)}</span>
<i class="fas fa-chevron-right thinking-chevron"></i>
</div>
<div class="thinking-full">${renderMarkdown(fullText)}</div>`;
} else {
stepEl.innerHTML = `
<div class="thinking-header no-toggle">
<i class="fas fa-lightbulb text-amber-400 flex-shrink-0"></i>
<span>${escapeHtml(oneLine)}</span>
</div>`;
}
stepsEl.appendChild(stepEl);
}
stepsEl.appendChild(stepEl);
}
accumulatedText = '';
contentEl.innerHTML = '';
accumulatedText = '';
contentEl.innerHTML = '';
// Add tool execution indicator (collapsible)
currentToolEl = document.createElement('div');
currentToolEl.className = 'agent-step agent-tool-step';
const argsStr = formatToolArgs(item.arguments || {});
currentToolEl.innerHTML = `
<div class="tool-header" onclick="this.parentElement.classList.toggle('expanded')">
<i class="fas fa-cog fa-spin text-primary-400 flex-shrink-0 tool-icon"></i>
<span class="tool-name">${item.tool}</span>
<i class="fas fa-chevron-right tool-chevron"></i>
</div>
<div class="tool-detail">
<div class="tool-detail-section">
<div class="tool-detail-label">Input</div>
<pre class="tool-detail-content">${argsStr}</pre>
// Add tool execution indicator (collapsible)
currentToolEl = document.createElement('div');
currentToolEl.className = 'agent-step agent-tool-step';
const argsStr = formatToolArgs(item.arguments || {});
currentToolEl.innerHTML = `
<div class="tool-header" onclick="this.parentElement.classList.toggle('expanded')">
<i class="fas fa-cog fa-spin text-primary-400 flex-shrink-0 tool-icon"></i>
<span class="tool-name">${item.tool}</span>
<i class="fas fa-chevron-right tool-chevron"></i>
</div>
<div class="tool-detail-section tool-output-section"></div>
</div>`;
stepsEl.appendChild(currentToolEl);
<div class="tool-detail">
<div class="tool-detail-section">
<div class="tool-detail-label">Input</div>
<pre class="tool-detail-content">${argsStr}</pre>
</div>
<div class="tool-detail-section tool-output-section"></div>
</div>`;
stepsEl.appendChild(currentToolEl);
scrollChatToBottom();
scrollChatToBottom();
} else if (item.type === 'tool_end') {
if (currentToolEl) {
const isError = item.status !== 'success';
const icon = currentToolEl.querySelector('.tool-icon');
icon.className = isError
? 'fas fa-times text-red-400 flex-shrink-0 tool-icon'
: 'fas fa-check text-primary-400 flex-shrink-0 tool-icon';
} else if (item.type === 'tool_end') {
if (currentToolEl) {
const isError = item.status !== 'success';
const icon = currentToolEl.querySelector('.tool-icon');
icon.className = isError
? 'fas fa-times text-red-400 flex-shrink-0 tool-icon'
: 'fas fa-check text-primary-400 flex-shrink-0 tool-icon';
// Show execution time
const nameEl = currentToolEl.querySelector('.tool-name');
if (item.execution_time !== undefined) {
nameEl.innerHTML += ` <span class="tool-time">${item.execution_time}s</span>`;
// Show execution time
const nameEl = currentToolEl.querySelector('.tool-name');
if (item.execution_time !== undefined) {
nameEl.innerHTML += ` <span class="tool-time">${item.execution_time}s</span>`;
}
// Fill output section
const outputSection = currentToolEl.querySelector('.tool-output-section');
if (outputSection && item.result) {
outputSection.innerHTML = `
<div class="tool-detail-label">${isError ? 'Error' : 'Output'}</div>
<pre class="tool-detail-content ${isError ? 'tool-error-text' : ''}">${escapeHtml(String(item.result))}</pre>`;
}
if (isError) currentToolEl.classList.add('tool-failed');
currentToolEl = null;
}
// Fill output section
const outputSection = currentToolEl.querySelector('.tool-output-section');
if (outputSection && item.result) {
outputSection.innerHTML = `
<div class="tool-detail-label">${isError ? 'Error' : 'Output'}</div>
<pre class="tool-detail-content ${isError ? 'tool-error-text' : ''}">${escapeHtml(String(item.result))}</pre>`;
}
} else if (item.type === 'image') {
ensureBotEl();
const imgEl = document.createElement('img');
imgEl.src = item.content;
imgEl.alt = 'screenshot';
imgEl.style.cssText = 'max-width:600px;border-radius:8px;margin:8px 0;cursor:pointer;box-shadow:0 1px 4px rgba(0,0,0,0.1);';
imgEl.onclick = () => window.open(item.content, '_blank');
mediaEl.appendChild(imgEl);
scrollChatToBottom();
if (isError) currentToolEl.classList.add('tool-failed');
currentToolEl = null;
} else if (item.type === 'text') {
// Intermediate text sent before media items; display it but keep SSE open.
ensureBotEl();
contentEl.classList.remove('sse-streaming');
const textContent = item.content || accumulatedText;
if (textContent) contentEl.innerHTML = renderMarkdown(textContent);
applyHighlighting(botEl);
scrollChatToBottom();
} else if (item.type === 'video') {
ensureBotEl();
const wrapper = document.createElement('div');
wrapper.innerHTML = _buildVideoHtml(item.content);
mediaEl.appendChild(wrapper.firstElementChild || wrapper);
scrollChatToBottom();
} else if (item.type === 'file') {
ensureBotEl();
const fileName = item.file_name || item.content.split('/').pop();
const fileEl = document.createElement('a');
fileEl.href = item.content;
fileEl.download = fileName;
fileEl.target = '_blank';
fileEl.className = 'file-attachment';
fileEl.style.cssText = 'display:inline-flex;align-items:center;gap:6px;padding:8px 14px;margin:8px 0;border-radius:8px;background:var(--bg-secondary,#f3f4f6);color:var(--text-primary,#374151);text-decoration:none;font-size:14px;border:1px solid var(--border-color,#e5e7eb);';
fileEl.innerHTML = `<i class="fas fa-file-download" style="color:#6b7280;"></i> ${fileName}`;
mediaEl.appendChild(fileEl);
scrollChatToBottom();
} else if (item.type === 'phase') {
// Coarse progress (e.g. cow install-browser); must not close SSE (unlike "done")
ensureBotEl();
const wrap = document.createElement('div');
wrap.className = 'text-xs sm:text-sm text-slate-600 dark:text-slate-400 border-l-2 border-primary-400 pl-2 py-1 my-0.5';
wrap.textContent = String(item.content || '');
stepsEl.appendChild(wrap);
scrollChatToBottom();
} else if (item.type === 'done') {
done = true;
es.close();
delete activeStreams[requestId];
// item.content may be empty when "done" is only a stream-close signal after media.
const finalText = item.content || accumulatedText;
if (!botEl && finalText) {
if (loadingEl) { loadingEl.remove(); loadingEl = null; }
addBotMessage(finalText, new Date((item.timestamp || Date.now() / 1000) * 1000), requestId);
} else if (botEl) {
contentEl.classList.remove('sse-streaming');
// Only update text content when there is something new to show.
if (finalText) contentEl.innerHTML = renderMarkdown(finalText);
applyHighlighting(botEl);
}
scrollChatToBottom();
} else if (item.type === 'error') {
done = true;
es.close();
delete activeStreams[requestId];
if (loadingEl) { loadingEl.remove(); loadingEl = null; }
addBotMessage(t('error_send'), new Date());
}
};
} else if (item.type === 'image') {
ensureBotEl();
const imgEl = document.createElement('img');
imgEl.src = item.content;
imgEl.alt = 'screenshot';
imgEl.style.cssText = 'max-width:600px;border-radius:8px;margin:8px 0;cursor:pointer;box-shadow:0 1px 4px rgba(0,0,0,0.1);';
imgEl.onclick = () => window.open(item.content, '_blank');
mediaEl.appendChild(imgEl);
scrollChatToBottom();
} else if (item.type === 'file') {
ensureBotEl();
const fileName = item.file_name || item.content.split('/').pop();
const fileEl = document.createElement('a');
fileEl.href = item.content;
fileEl.download = fileName;
fileEl.target = '_blank';
fileEl.className = 'file-attachment';
fileEl.style.cssText = 'display:inline-flex;align-items:center;gap:6px;padding:8px 14px;margin:8px 0;border-radius:8px;background:var(--bg-secondary,#f3f4f6);color:var(--text-primary,#374151);text-decoration:none;font-size:14px;border:1px solid var(--border-color,#e5e7eb);';
fileEl.innerHTML = `<i class="fas fa-file-download" style="color:#6b7280;"></i> ${fileName}`;
mediaEl.appendChild(fileEl);
scrollChatToBottom();
} else if (item.type === 'phase') {
// Coarse progress (e.g. cow install-browser); must not close SSE (unlike "done")
ensureBotEl();
const wrap = document.createElement('div');
wrap.className = 'text-xs sm:text-sm text-slate-600 dark:text-slate-400 border-l-2 border-primary-400 pl-2 py-1 my-0.5';
wrap.textContent = String(item.content || '');
stepsEl.appendChild(wrap);
scrollChatToBottom();
} else if (item.type === 'done') {
es.onerror = function() {
es.close();
delete activeStreams[requestId];
const finalText = item.content || accumulatedText;
if (done) return;
if (!botEl && finalText) {
if (loadingEl) { loadingEl.remove(); loadingEl = null; }
addBotMessage(finalText, new Date((item.timestamp || Date.now() / 1000) * 1000), requestId);
} else if (botEl) {
if (reconnectCount < MAX_RECONNECTS) {
reconnectCount++;
const delay = Math.min(RECONNECT_BASE_MS * reconnectCount, 5000);
console.warn(`[SSE] connection lost for ${requestId}, reconnecting in ${delay}ms (attempt ${reconnectCount}/${MAX_RECONNECTS})`);
setTimeout(connect, delay);
return;
}
// Exhausted retries, show whatever we have
if (loadingEl) { loadingEl.remove(); loadingEl = null; }
if (!botEl) {
addBotMessage(t('error_send'), new Date());
} else if (accumulatedText) {
contentEl.classList.remove('sse-streaming');
if (finalText) contentEl.innerHTML = renderMarkdown(finalText);
contentEl.innerHTML = renderMarkdown(accumulatedText);
applyHighlighting(botEl);
}
scrollChatToBottom();
};
}
} else if (item.type === 'error') {
es.close();
delete activeStreams[requestId];
if (loadingEl) { loadingEl.remove(); loadingEl = null; }
addBotMessage(t('error_send'), new Date());
}
};
es.onerror = function() {
es.close();
delete activeStreams[requestId];
if (loadingEl) { loadingEl.remove(); loadingEl = null; }
if (!botEl) {
addBotMessage(t('error_send'), new Date());
} else if (accumulatedText) {
contentEl.classList.remove('sse-streaming');
contentEl.innerHTML = renderMarkdown(accumulatedText);
applyHighlighting(botEl);
}
};
connect();
}
function startPolling() {

View File

@@ -126,6 +126,13 @@ class WebChannel(ChatChannel):
logger.debug(f"SSE skipped duplicate file for request {request_id}")
return
# Skip http-URL FILE/IMAGE_URL replies produced by chat_channel's media extraction:
# the text reply (already sent as "done") contains the URL and the frontend will
# render it via renderMarkdown/injectVideoPlayers, so no separate SSE event needed.
if reply.type in (ReplyType.FILE, ReplyType.IMAGE_URL) and content.startswith(("http://", "https://")):
logger.debug(f"SSE skipped http media reply for request {request_id}")
return
self.sse_queues[request_id].put({
"type": "done",
"content": content,
@@ -322,14 +329,18 @@ class WebChannel(ChatChannel):
"""
SSE generator for a given request_id.
Yields UTF-8 encoded bytes to avoid WSGI Latin-1 mangling.
Supports client reconnection: the queue is only removed after a
"done" event is consumed, so a new GET /stream with the same
request_id can resume reading remaining events.
"""
if request_id not in self.sse_queues:
yield b"data: {\"type\": \"error\", \"message\": \"invalid request_id\"}\n\n"
return
q = self.sse_queues[request_id]
timeout = 300 # 5 minutes max
deadline = time.time() + timeout
idle_timeout = 600 # 10 minutes without any real event
deadline = time.time() + idle_timeout
done = False
try:
while time.time() < deadline:
@@ -339,13 +350,18 @@ class WebChannel(ChatChannel):
yield b": keepalive\n\n"
continue
# Real event received, reset idle deadline
deadline = time.time() + idle_timeout
payload = json.dumps(item, ensure_ascii=False)
yield f"data: {payload}\n\n".encode("utf-8")
if item.get("type") == "done":
done = True
break
finally:
self.sse_queues.pop(request_id, None)
if done:
self.sse_queues.pop(request_id, None)
def poll_response(self):
"""
@@ -447,8 +463,14 @@ class WebChannel(ChatChannel):
func = web.httpserver.StaticMiddleware(app.wsgifunc())
func = web.httpserver.LogMiddleware(func)
server = web.httpserver.WSGIServer(("0.0.0.0", port), func)
# Allow concurrent requests by not blocking on in-flight handler threads
server.daemon_threads = True
# Default request_queue_size(5) / timeout(10s) / numthreads(10) are
# too small: when SSE streams occupy many threads, the backlog fills
# and new connections get refused (ERR_CONNECTION_ABORTED).
server.request_queue_size = 128
server.timeout = 300
server.requests.min = 20
server.requests.max = 80
self._http_server = server
try:
server.start()
@@ -563,7 +585,7 @@ class ConfigHandler:
_RECOMMENDED_MODELS = [
const.MINIMAX_M2_7, const.MINIMAX_M2_5, const.MINIMAX_M2_1, const.MINIMAX_M2_1_LIGHTNING,
const.GLM_5_TURBO, const.GLM_5, const.GLM_4_7,
const.QWEN3_MAX, const.QWEN35_PLUS,
const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX,
const.KIMI_K2_5, const.KIMI_K2,
const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE,
const.CLAUDE_4_6_SONNET, const.CLAUDE_4_6_OPUS, const.CLAUDE_4_5_SONNET,
@@ -592,7 +614,7 @@ class ConfigHandler:
"api_key_field": "dashscope_api_key",
"api_base_key": None,
"api_base_default": None,
"models": [const.QWEN3_MAX, const.QWEN35_PLUS],
"models": [const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX],
}),
("moonshot", {
"label": "Kimi",

View File

@@ -37,11 +37,19 @@ def _random_wechat_uin() -> str:
return base64.b64encode(str(val).encode("utf-8")).decode("utf-8")
CHANNEL_VERSION = "2.0.0"
# iLink-App-ClientVersion: uint32 encoded as major<<16 | minor<<8 | patch
# 2.0.0 → 0x00020000 = 131072
CLIENT_VERSION = "131072"
def _build_headers(token: str = "") -> dict:
headers = {
"Content-Type": "application/json",
"AuthorizationType": "ilink_bot_token",
"X-WECHAT-UIN": _random_wechat_uin(),
"iLink-App-Id": "bot",
"iLink-App-ClientVersion": CLIENT_VERSION,
}
if token:
headers["Authorization"] = f"Bearer {token}"
@@ -64,6 +72,7 @@ class WeixinApi:
def _post(self, endpoint: str, body: dict, timeout: int = DEFAULT_API_TIMEOUT) -> dict:
url = _ensure_trailing_slash(self.base_url) + endpoint
headers = _build_headers(self.token)
body.setdefault("base_info", {}).setdefault("channel_version", CHANNEL_VERSION)
try:
resp = requests.post(url, json=body, headers=headers, timeout=timeout)
resp.raise_for_status()
@@ -210,7 +219,10 @@ class WeixinApi:
def poll_qr_status(self, qrcode: str, timeout: int = QR_POLL_TIMEOUT) -> dict:
url = (_ensure_trailing_slash(self.base_url) +
f"ilink/bot/get_qrcode_status?qrcode={requests.utils.quote(qrcode)}")
headers = {"iLink-App-ClientVersion": "1"}
headers = {
"iLink-App-Id": "bot",
"iLink-App-ClientVersion": CLIENT_VERSION,
}
try:
resp = requests.get(url, headers=headers, timeout=timeout)
resp.raise_for_status()

View File

@@ -166,10 +166,18 @@ class WeixinChannel(ChatChannel):
print("=" * 60)
try:
import qrcode as qr_lib
import io
qr = qr_lib.QRCode(error_correction=qr_lib.constants.ERROR_CORRECT_L, box_size=1, border=1)
qr.add_data(qrcode_url)
qr.make(fit=True)
qr.print_ascii(invert=True)
buf = io.StringIO()
qr.print_ascii(out=buf, invert=True)
try:
print(buf.getvalue())
except UnicodeEncodeError:
# Windows GBK terminals cannot render Unicode block characters
print(f"\n (终端不支持显示二维码,请使用链接扫码)")
print(f" 二维码链接: {qrcode_url}\n")
except ImportError:
print(f"\n 二维码链接: {qrcode_url}")
print(" (安装 'qrcode' 包可在终端显示二维码)\n")

View File

@@ -178,7 +178,10 @@ def update(ctx):
"""Update CowAgent and restart."""
root = get_project_root()
# 1. Git pull while service is still running
# 1. Stop service first so git pull won't conflict with running code
ctx.invoke(stop)
# 2. Git pull
if os.path.isdir(os.path.join(root, ".git")):
click.echo("Pulling latest code...")
ret = subprocess.call(["git", "pull"], cwd=root)
@@ -188,28 +191,61 @@ def update(ctx):
else:
click.echo("Not a git repository, skipping code update.")
# 2. Stop service
ctx.invoke(stop)
# 3. Install dependencies
python = sys.executable
req_file = os.path.join(root, "requirements.txt")
if os.path.exists(req_file):
click.echo("Installing dependencies...")
subprocess.call(
[python, "-m", "pip", "install", "-r", "requirements.txt", "-q"],
if _IS_WIN:
# On Windows, `cow.exe` (this process) locks the exe file, so
# `pip install -e .` fails with WinError 5. Write a small .bat
# helper that waits for cow.exe to exit, then installs & starts.
bat = os.path.join(root, "_cow_update.bat")
lines = [
"@echo off",
"chcp 65001 >nul",
"echo Waiting for cow.exe to exit...",
"timeout /t 3 /nobreak >nul",
]
if os.path.exists(req_file):
lines.append(f'echo Installing dependencies...')
lines.append(f'"{python}" -m pip install -r requirements.txt -q')
lines += [
"echo Reinstalling cow CLI...",
f'"{python}" -m pip install -e . -q',
"echo Starting CowAgent...",
f'"{python}" -m cli.cli start --no-logs',
"echo.",
"echo Update complete. You can close this window.",
"pause >nul",
"del \"%~f0\"",
]
with open(bat, "w", encoding="utf-8") as f:
f.write("\n".join(lines) + "\n")
subprocess.Popen(
["cmd.exe", "/c", "start", "CowAgent Update", "/wait", bat],
cwd=root,
)
click.echo(click.style(
"✓ Update script launched. Please follow the new window for progress.",
fg="green"))
else:
# 3. Install dependencies
if os.path.exists(req_file):
click.echo("Installing dependencies...")
subprocess.call(
[python, "-m", "pip", "install", "-r", "requirements.txt", "-q"],
cwd=root,
)
click.echo("Reinstalling cow CLI...")
subprocess.call(
[python, "-m", "pip", "install", "-e", ".", "-q"],
cwd=root,
)
click.echo("Reinstalling cow CLI...")
subprocess.call(
[python, "-m", "pip", "install", "-e", ".", "-q"],
cwd=root,
)
# 4. Start service and follow logs
click.echo("")
time.sleep(1)
ctx.invoke(start, no_logs=False)
# 4. Start service
click.echo("")
time.sleep(1)
ctx.invoke(start, no_logs=False)
@click.command()

View File

@@ -47,8 +47,8 @@ CREDENTIAL_MAP = {
class CloudClient(LinkAIClient):
def __init__(self, api_key: str, channel, host: str = ""):
super().__init__(api_key, host)
def __init__(self, api_key: str, channel, host: str = "", port=None):
super().__init__(api_key, host, port=port)
self.channel = channel
self.client_type = channel.channel_type
self.channel_mgr = None
@@ -733,7 +733,7 @@ def start(channel, channel_mgr=None):
return
global chat_client
chat_client = CloudClient(api_key=conf().get("linkai_api_key"), host=conf().get("cloud_host", ""), channel=channel)
chat_client = CloudClient(api_key=conf().get("linkai_api_key"), host=conf().get("cloud_host", ""), port=conf().get("cloud_port"), channel=channel)
chat_client.channel_mgr = channel_mgr
chat_client.config = _build_config()
chat_client.start()

View File

@@ -7,8 +7,8 @@ XUNFEI = "xunfei"
CHATGPTONAZURE = "chatGPTOnAzure"
LINKAI = "linkai"
CLAUDEAPI= "claudeAPI"
QWEN = "qwen" # 旧版千问接入
QWEN_DASHSCOPE = "dashscope" # 新版千问接入(百炼)
QWEN = "qwen" # 千问 (兼容旧配置,实际走 DashscopeBot)
QWEN_DASHSCOPE = "dashscope" # 千问 DashScope 接入
GEMINI = "gemini"
ZHIPU_AI = "zhipu"
MOONSHOT = "moonshot"
@@ -81,14 +81,14 @@ TTS_1_HD = "tts-1-hd"
DEEPSEEK_CHAT = "deepseek-chat" # DeepSeek-V3对话模型
DEEPSEEK_REASONER = "deepseek-reasoner" # DeepSeek-R1模型
# Qwen (通义千问 - 阿里云)
QWEN = "qwen"
# Qwen (通义千问 - 阿里云 DashScope)
QWEN_TURBO = "qwen-turbo"
QWEN_PLUS = "qwen-plus"
QWEN_MAX = "qwen-max"
QWEN_LONG = "qwen-long"
QWEN3_MAX = "qwen3-max" # Qwen3 Max - Agent推荐模型
QWEN35_PLUS = "qwen3.5-plus" # Qwen3.5 Plus - Omni model (MultiModalConversation)
QWEN36_PLUS = "qwen3.6-plus" # Qwen3.6 Plus - Omni model (MultiModalConversation)
QWQ_PLUS = "qwq-plus"
# MiniMax
@@ -172,7 +172,7 @@ MODEL_LIST = [
DEEPSEEK_CHAT, DEEPSEEK_REASONER,
# Qwen
QWEN, QWEN_TURBO, QWEN_PLUS, QWEN_MAX, QWEN_LONG, QWEN3_MAX, QWEN35_PLUS,
QWEN36_PLUS, QWEN35_PLUS, QWEN3_MAX, QWEN_MAX, QWEN_PLUS, QWEN_TURBO, QWEN_LONG,
# MiniMax
MiniMax, MINIMAX_M2_7, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,

View File

@@ -189,6 +189,7 @@ available_setting = {
"linkai_app_code": "",
"linkai_api_base": "https://api.link-ai.tech", # linkAI服务地址
"cloud_host": "client.link-ai.tech",
"cloud_port": None,
"cloud_deployment_id": "",
"minimax_api_key": "",
"Minimax_group_id": "",

View File

@@ -171,10 +171,10 @@
{
"group": "命令系统",
"pages": [
"commands/index",
"commands/process",
"commands/skill",
"commands/general"
"cli/index",
"cli/process",
"cli/skill",
"cli/general"
]
}
]
@@ -327,15 +327,15 @@
]
},
{
"tab": "Commands",
"tab": "CLI",
"groups": [
{
"group": "Command System",
"pages": [
"en/commands/index",
"en/commands/process",
"en/commands/skill",
"en/commands/chat"
"en/cli/index",
"en/cli/process",
"en/cli/skill",
"en/cli/chat"
]
}
]
@@ -488,15 +488,15 @@
]
},
{
"tab": "コマンド",
"tab": "CLI",
"groups": [
{
"group": "コマンドシステム",
"pages": [
"ja/commands/index",
"ja/commands/process",
"ja/commands/skill",
"ja/commands/general"
"ja/cli/index",
"ja/cli/process",
"ja/cli/skill",
"ja/cli/general"
]
}
]

View File

@@ -76,7 +76,7 @@ irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
After running, the Web service starts by default. Access `http://localhost:9899/chat` to chat.
Script usage: [One-click Install](https://docs.cowagent.ai/en/guide/quick-start). After installation, you can also use `cow start`, `cow stop`, and other [CLI commands](https://docs.cowagent.ai/en/commands/index) to manage the service.
Script usage: [One-click Install](https://docs.cowagent.ai/en/guide/quick-start). After installation, you can also use `cow start`, `cow stop`, and other [CLI commands](https://docs.cowagent.ai/en/cli/index) to manage the service.
### Manual Installation
@@ -100,7 +100,7 @@ pip3 install -r requirements-optional.txt # optional but recommended
pip3 install -e .
```
After installation, use `cow` commands to manage the service (start, stop, update, etc.) and skills. See [Command Docs](https://docs.cowagent.ai/en/commands/index).
After installation, use `cow` commands to manage the service (start, stop, update, etc.) and skills. See [Command Docs](https://docs.cowagent.ai/en/cli/index).
**4. Install browser (optional)**
@@ -165,7 +165,7 @@ Supports mainstream model providers. Recommended models for Agent mode:
| GLM | `glm-5-turbo` |
| Kimi | `kimi-k2.5` |
| Doubao | `doubao-seed-2-0-code-preview-260215` |
| Qwen | `qwen3.5-plus` |
| Qwen | `qwen3.6-plus` |
| Claude | `claude-sonnet-4-6` |
| Gemini | `gemini-3.1-pro-preview` |
| OpenAI | `gpt-5.4` |

View File

@@ -47,7 +47,7 @@ After installation, use the `cow` command to manage the service:
| `cow update` | Update code and restart |
| `cow install-browser` | Install browser tool dependencies |
See the [Commands documentation](/en/commands/index) for more details.
See the [Commands documentation](/en/cli/index) for more details.
<Note>
If the `cow` command is not available, you can use `./run.sh <command>` (Linux/macOS) or `.\scripts\run.ps1 <command>` (Windows) as a fallback. Both are functionally equivalent.

View File

@@ -117,4 +117,4 @@ cow skill install pptx # Install a skill
cow install-browser # Install browser tool
```
See [Command Overview](https://docs.cowagent.ai/en/commands) for details.
See [Command Overview](https://docs.cowagent.ai/en/cli) for details.

View File

@@ -31,7 +31,7 @@ CowAgent can proactively think and plan tasks, operate computers and external re
<Card title="Tool System" icon="wrench" href="/en/tools/index">
Built-in tools for file I/O, terminal execution, browser automation, scheduled tasks, messaging, and more. The Agent autonomously invokes tools to accomplish complex tasks.
</Card>
<Card title="Command System" icon="terminal" href="/en/commands/index">
<Card title="Command System" icon="terminal" href="/en/cli/index">
Provides terminal CLI and in-chat commands for process management, skill installation, configuration, context inspection, and other common operations.
</Card>
<Card title="Multiple Model Support" icon="microchip" href="/en/models/index">

View File

@@ -6,7 +6,7 @@ description: Supported models and recommended choices for CowAgent
CowAgent supports mainstream LLMs from domestic and international providers. Model interfaces are implemented in the project's `models/` directory.
<Note>
For Agent mode, the following models are recommended based on quality and cost: MiniMax-M2.7, glm-5-turbo, kimi-k2.5, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
For Agent mode, the following models are recommended based on quality and cost: MiniMax-M2.7, glm-5-turbo, kimi-k2.5, qwen3.6-plus, claude-sonnet-4-6, gemini-3.1-pro-preview
</Note>
## Configuration
@@ -25,7 +25,7 @@ You can also use the [LinkAI](https://link-ai.tech) platform interface to flexib
glm-5-turbo, glm-5 and other series models
</Card>
<Card title="Qwen (Tongyi Qianwen)" href="/en/models/qwen">
qwen3.5-plus, qwen3-max and more
qwen3.6-plus, qwen3-max and more
</Card>
<Card title="Kimi" href="/en/models/kimi">
kimi-k2.5, kimi-k2 and more

View File

@@ -5,14 +5,14 @@ description: Tongyi Qianwen model configuration
```json
{
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"dashscope_api_key": "YOUR_API_KEY"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Options include `qwen3.5-plus`, `qwen3-max`, `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwq-plus`, etc. |
| `model` | Options include `qwen3.6-plus`, `qwen3.5-plus`, `qwen3-max`, `qwen-max`, `qwen-plus`, `qwen-turbo`, `qwq-plus`, etc. |
| `dashscope_api_key` | Create at [Bailian Console](https://bailian.console.aliyun.com/?tab=model#/api-key). See [official docs](https://bailian.console.aliyun.com/?tab=api#/api) |
OpenAI-compatible configuration is also supported:
@@ -20,7 +20,7 @@ OpenAI-compatible configuration is also supported:
```json
{
"bot_type": "openai",
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "YOUR_API_KEY"
}

View File

@@ -12,7 +12,7 @@ New CLI command system for managing CowAgent from terminal and chat:
- **Web console**: Type `/` in the input box to open a slash command menu, with arrow-key input history
- **Windows support**: New PowerShell script `scripts/run.ps1` with `cow` command support
Docs: [Command Overview](https://docs.cowagent.ai/en/commands)
Docs: [Command Overview](https://docs.cowagent.ai/en/cli)
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />

View File

@@ -17,7 +17,7 @@ CowAgent offers multiple ways to acquire skills:
- **URL** — Install from zip archives or SKILL.md links
- **Conversational creation** — Let the Agent create skills through natural language conversation
See [Install Skills](/en/skills/install) and [Skill Management Commands](/en/commands/skill) for details. You can also [create skills](/en/skills/create) through conversation.
See [Install Skills](/en/skills/install) and [Skill Management Commands](/en/cli/skill) for details. You can also [create skills](/en/skills/create) through conversation.
## Skill Loading Priority

View File

@@ -49,5 +49,5 @@ Supports zip archives and SKILL.md file links:
```
<Tip>
All commands above work in the terminal by replacing `/skill` with `cow skill`. See [Skill Management Commands](/en/commands/skill) for full documentation.
All commands above work in the terminal by replacing `/skill` with `cow skill`. See [Skill Management Commands](/en/cli/skill) for full documentation.
</Tip>

72
docs/en/tools/vision.mdx Normal file
View File

@@ -0,0 +1,72 @@
---
title: vision - Image Analysis
description: Analyze image content (recognition, description, OCR, etc.)
---
Analyze local images or image URLs using Vision API. Supports content description, text extraction (OCR), object recognition, and more.
## Model Selection
The vision tool uses a multi-level auto-selection strategy with automatic fallback — no manual configuration required:
1. **Main model** — uses the currently configured main model for image recognition (zero extra cost)
2. **Other configured models** — auto-discovers other models with configured API keys as alternatives
3. **OpenAI** — uses `open_ai_api_key` to call gpt-4.1-mini
4. **LinkAI** — uses `linkai_api_key` to call LinkAI vision service
When `use_linkai=true`, LinkAI is promoted to the highest priority.
If the current provider fails, the tool automatically tries the next one until it succeeds or all fail.
### Supported Models
| Vendor | Vision Model | Notes |
| --- | --- | --- |
| OpenAI / Compatible | Main model | All OpenAI-compatible multimodal models |
| Qwen (DashScope) | Main model | Via MultiModalConversation API |
| Claude | Main model | Anthropic native image format |
| Gemini | Main model | inlineData format |
| Doubao | Main model | doubao-seed-2-0 series natively supported |
| Kimi (Moonshot) | Main model | kimi-k2.5 natively supported |
| ZhipuAI | glm-5v-turbo | Always uses dedicated vision model |
| MiniMax | MiniMax-Text-01 | Always uses dedicated vision model |
<Note>
ZhipuAI and MiniMax text models do not support image understanding, so their dedicated vision models are always used automatically.
</Note>
## Parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `image` | string | Yes | Local file path or HTTP(S) image URL |
| `question` | string | Yes | Question to ask about the image |
Supported image formats: jpg, jpeg, png, gif, webp
## Custom Configuration
To specify a particular model for the vision tool, add to `config.json`:
```json
{
"tool": {
"vision": {
"model": "gpt-4o"
}
}
}
```
In most cases no configuration is needed. The tool works automatically as long as the main model supports multimodal input or any vision-capable API key is configured.
## Use Cases
- Describe image content
- Extract text from images (OCR)
- Identify objects, colors, scenes
- Analyze screenshots and scanned documents
<Note>
Images larger than 1MB are automatically compressed (max edge 1536px). All images (including remote URLs) are converted to base64 for transmission to ensure compatibility with all model backends.
</Note>

View File

@@ -47,7 +47,7 @@ description: 使用脚本一键安装和管理 CowAgent
| `cow update` | 更新代码并重启 |
| `cow install-browser` | 安装浏览器工具依赖 |
更多命令和用法参考 [命令文档](/commands/index)。
更多命令和用法参考 [命令文档](/cli/index)。
<Note>
如果 `cow` 命令不可用,也可以使用 `./run.sh <命令>`Linux/macOS或 `.\scripts\run.ps1 <命令>`Windows作为替代功能等效。

View File

@@ -36,7 +36,7 @@ pip3 install -e .
更新完成后重启服务:
```bash
# 使用 Cow CLI
# 使用 Cow CLI (推荐)
cow restart
# 或使用 run.sh

View File

@@ -118,6 +118,6 @@ cow skill install pptx # 安装技能
cow install-browser # 安装浏览器工具
```
详细命令参考 [命令总览](https://docs.cowagent.ai/commands)。
详细命令参考 [命令总览](https://docs.cowagent.ai/cli)。
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />

View File

@@ -36,7 +36,7 @@ CowAgent 支持灵活切换多种模型,能处理文本、语音、图片、
<Card title="工具系统" icon="wrench" href="/tools/index">
内置文件读写、终端执行、浏览器操作、定时任务、消息发送等工具Agent 可自主调用工具完成复杂任务。
</Card>
<Card title="命令系统" icon="terminal" href="/commands/index">
<Card title="命令系统" icon="terminal" href="/cli/index">
提供终端 CLI 和对话中的命令,支持进程管理、技能安装、配置修改、上下文查看等常用操作。
</Card>
<Card title="多模型支持" icon="microchip" href="/models/index">

View File

@@ -76,7 +76,7 @@ irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
実行後、デフォルトでWebサービスが起動します。`http://localhost:9899/chat` にアクセスしてチャットを開始できます。
スクリプトの使い方: [ワンクリックインストール](https://docs.cowagent.ai/ja/guide/quick-start)。インストール後は `cow start``cow stop` などの [CLI コマンド](https://docs.cowagent.ai/ja/commands/index)でサービスを管理できます。
スクリプトの使い方: [ワンクリックインストール](https://docs.cowagent.ai/ja/guide/quick-start)。インストール後は `cow start``cow stop` などの [CLI コマンド](https://docs.cowagent.ai/ja/cli/index)でサービスを管理できます。
### 手動インストール
@@ -100,7 +100,7 @@ pip3 install -r requirements-optional.txt # 任意ですが推奨
pip3 install -e .
```
インストール後、`cow` コマンドでサービス管理起動、停止、更新などやSkill管理ができます。[コマンドドキュメント](https://docs.cowagent.ai/ja/commands/index)を参照してください。
インストール後、`cow` コマンドでサービス管理起動、停止、更新などやSkill管理ができます。[コマンドドキュメント](https://docs.cowagent.ai/ja/cli/index)を参照してください。
**4. ブラウザのインストール(任意)**
@@ -165,7 +165,7 @@ sudo docker logs -f chatgpt-on-wechat
| GLM | `glm-5-turbo` |
| Kimi | `kimi-k2.5` |
| Doubao | `doubao-seed-2-0-code-preview-260215` |
| Qwen | `qwen3.5-plus` |
| Qwen | `qwen3.6-plus` |
| Claude | `claude-sonnet-4-6` |
| Gemini | `gemini-3.1-pro-preview` |
| OpenAI | `gpt-5.4` |

View File

@@ -47,7 +47,7 @@ Linux、macOS、Windowsに対応しています。Python 3.7〜3.12が必要で
| `cow update` | コードを更新して再起動 |
| `cow install-browser` | ブラウザツールの依存をインストール |
詳細は[コマンドドキュメント](/ja/commands/index)を参照してください。
詳細は[コマンドドキュメント](/ja/cli/index)を参照してください。
<Note>
`cow` コマンドが利用できない場合は、`./run.sh <コマンド>`Linux/macOSまたは `.\scripts\run.ps1 <コマンド>`Windowsで代替できます。機能は同等です。

View File

@@ -117,4 +117,4 @@ cow skill install pptx # Skill をインストール
cow install-browser # ブラウザツールをインストール
```
詳細は [コマンド一覧](https://docs.cowagent.ai/ja/commands) を参照してください。
詳細は [コマンド一覧](https://docs.cowagent.ai/ja/cli) を参照してください。

View File

@@ -31,7 +31,7 @@ CowAgent は自ら思考しタスクを計画し、コンピュータや外部
<Card title="ツールシステム" icon="wrench" href="/ja/tools/index">
ファイル読み書き、ターミナル実行、ブラウザ操作、スケジュールタスク、メッセージ送信などの組み込みツールを提供。Agent が自律的にツールを呼び出して複雑なタスクを完了します。
</Card>
<Card title="コマンドシステム" icon="terminal" href="/ja/commands/index">
<Card title="コマンドシステム" icon="terminal" href="/ja/cli/index">
ターミナル CLI とチャット内コマンドを提供し、プロセス管理、Skill インストール、設定変更、コンテキスト確認などの一般的な操作をサポートします。
</Card>
<Card title="複数モデル対応" icon="microchip" href="/ja/models/index">

View File

@@ -6,7 +6,7 @@ description: CowAgentがサポートするモデルとおすすめの選択肢
CowAgentは国内外の主要なLLMをサポートしています。モデルインターフェースはプロジェクトの`models/`ディレクトリに実装されています。
<Note>
Agent モードでは、品質とコストのバランスから以下のモデルをおすすめします: MiniMax-M2.7、glm-5-turbo、kimi-k2.5、qwen3.5-plus、claude-sonnet-4-6、gemini-3.1-pro-preview
Agent モードでは、品質とコストのバランスから以下のモデルをおすすめします: MiniMax-M2.7、glm-5-turbo、kimi-k2.5、qwen3.6-plus、claude-sonnet-4-6、gemini-3.1-pro-preview
</Note>
## 設定
@@ -25,7 +25,7 @@ CowAgentは国内外の主要なLLMをサポートしています。モデルイ
glm-5-turbo、glm-5およびその他のシリーズモデル
</Card>
<Card title="Qwen (通义千问)" href="/ja/models/qwen">
qwen3.5-plus、qwen3-maxなど
qwen3.6-plus、qwen3-maxなど
</Card>
<Card title="Kimi" href="/ja/models/kimi">
kimi-k2.5、kimi-k2など

View File

@@ -1,18 +1,18 @@
---
title: Qwen (通义千问)
description: 通义千问モデルの設定
title: Qwen (通義千問)
description: 通義千問モデルの設定
---
```json
{
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"dashscope_api_key": "YOUR_API_KEY"
}
```
| パラメータ | 説明 |
| --- | --- |
| `model` | `qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus`などから選択可能 |
| `model` | `qwen3.6-plus`、`qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus`などから選択可能 |
| `dashscope_api_key` | [百炼 Console](https://bailian.console.aliyun.com/?tab=model#/api-key)で作成。[公式ドキュメント](https://bailian.console.aliyun.com/?tab=api#/api)を参照 |
OpenAI互換の設定もサポートしています:
@@ -20,7 +20,7 @@ OpenAI互換の設定もサポートしています:
```json
{
"bot_type": "openai",
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "YOUR_API_KEY"
}

View File

@@ -12,7 +12,7 @@ description: CowAgent 2.0.5 - Cow CLI、Skill Hub オープンソース、ブラ
- **Web コンソール**:入力欄で `/` を入力するとスラッシュコマンドメニューが表示、矢印キーで入力履歴を辿れる
- **Windows サポート**PowerShell スクリプト `scripts/run.ps1` を追加、`cow` コマンドに対応
ドキュメント:[コマンド一覧](https://docs.cowagent.ai/ja/commands)
ドキュメント:[コマンド一覧](https://docs.cowagent.ai/ja/cli)
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />

View File

@@ -17,7 +17,7 @@ CowAgent ではスキルを取得する複数の方法を提供しています
- **URL** — zip アーカイブや SKILL.md リンクからインストール
- **会話で作成** — 自然言語の会話を通じて Agent にスキルを自動作成させる
詳細は[スキルのインストール](/ja/skills/install)と[スキル管理コマンド](/ja/commands/skill)を参照してください。会話を通じて[スキルを作成](/ja/skills/create)することもできます。
詳細は[スキルのインストール](/ja/skills/install)と[スキル管理コマンド](/ja/cli/skill)を参照してください。会話を通じて[スキルを作成](/ja/skills/create)することもできます。
## スキルの読み込み優先順位

View File

@@ -49,5 +49,5 @@ zip アーカイブと SKILL.md ファイルリンクに対応:
```
<Tip>
上記のすべてのコマンドは、ターミナルでは `/skill` を `cow skill` に置き換えて使用できます。完全なコマンドドキュメントは[スキル管理コマンド](/ja/commands/skill)を参照してください。
上記のすべてのコマンドは、ターミナルでは `/skill` を `cow skill` に置き換えて使用できます。完全なコマンドドキュメントは[スキル管理コマンド](/ja/cli/skill)を参照してください。
</Tip>

72
docs/ja/tools/vision.mdx Normal file
View File

@@ -0,0 +1,72 @@
---
title: vision - 画像分析
description: 画像コンテンツの分析認識、説明、OCR など)
---
Vision API を使用してローカル画像や画像 URL を分析します。コンテンツの説明、テキスト抽出OCR、オブジェクト認識などに対応しています。
## モデル選択
Vision ツールは多段階の自動選択+自動フォールバック戦略を採用しており、手動設定なしで利用可能です:
1. **メインモデル** — 現在設定されているメインモデルで画像認識を実行(追加コストなし)
2. **その他の設定済みモデル** — API キーが設定されている他のマルチモーダルモデルを自動検出
3. **OpenAI** — `open_ai_api_key` を使用して gpt-4.1-mini を呼び出し
4. **LinkAI** — `linkai_api_key` を使用して LinkAI ビジョンサービスを呼び出し
`use_linkai=true` の場合、LinkAI が最優先になります。
現在のプロバイダーが失敗した場合、成功するかすべて失敗するまで自動的に次のプロバイダーを試行します。
### 対応モデル
| ベンダー | ビジョンモデル | 説明 |
| --- | --- | --- |
| OpenAI / 互換プロトコル | メインモデル | すべての OpenAI 互換マルチモーダルモデルに対応 |
| 通義千問 (DashScope) | メインモデル | MultiModalConversation API 経由 |
| Claude | メインモデル | Anthropic ネイティブ画像形式 |
| Gemini | メインモデル | inlineData 形式 |
| 豆包 (Doubao) | メインモデル | doubao-seed-2-0 シリーズがネイティブ対応 |
| Kimi (Moonshot) | メインモデル | kimi-k2.5 がネイティブ対応 |
| 智谱 AI | glm-5v-turbo | 常にビジョン専用モデルを使用 |
| MiniMax | MiniMax-Text-01 | 常にビジョン専用モデルを使用 |
<Note>
智谱 AI と MiniMax のテキストモデルは画像理解に対応していないため、対応するビジョン専用モデルが自動的に使用されます。
</Note>
## パラメータ
| パラメータ | 型 | 必須 | 説明 |
| --- | --- | --- | --- |
| `image` | string | はい | ローカルファイルパスまたは HTTP(S) 画像 URL |
| `question` | string | はい | 画像に対する質問 |
対応画像形式jpg、jpeg、png、gif、webp
## カスタム設定
Vision ツールで使用するモデルを指定するには、`config.json` に以下を追加します:
```json
{
"tool": {
"vision": {
"model": "gpt-4o"
}
}
}
```
ほとんどの場合、設定は不要です。メインモデルがマルチモーダルに対応しているか、ビジョン対応の API キーが設定されていれば自動的に動作します。
## ユースケース
- 画像コンテンツの説明
- 画像からのテキスト抽出OCR
- オブジェクト、色、シーンの識別
- スクリーンショットやスキャン文書の分析
<Note>
1MB を超える画像は自動的に圧縮されます(最大辺 1536px。すべての画像リモート URL を含む)は base64 に変換して送信され、すべてのモデルバックエンドとの互換性を確保します。
</Note>

View File

@@ -6,19 +6,20 @@ description: CowAgent 支持的模型及推荐选择
CowAgent 支持国内外主流厂商的大语言模型,模型接口实现在项目的 `models/` 目录下。
<Note>
Agent 模式下推荐使用以下模型可根据效果及成本综合选择MiniMax-M2.7、glm-5-turbo、kimi-k2.5、qwen3.5-plus、claude-sonnet-4-6、gemini-3.1-pro-preview
Agent 模式下推荐使用以下模型可根据效果及成本综合选择MiniMax-M2.7、glm-5-turbo、kimi-k2.5、qwen3.6-plus、claude-sonnet-4-6、gemini-3.1-pro-preview
同时支持使用 [LinkAI](https://link-ai.tech) 平台接口,可灵活切换多种模型,并支持知识库、工作流、插件等 Agent 能力。
</Note>
## 配置方式
根据所选模型,在 `config.json` 中填写对应的模型名称和 API Key 即可。每个模型也支持 OpenAI 兼容方式接入,将 `bot_type` 设为 `openai`,配置 `open_ai_api_base` 和 `open_ai_api_key`。
同时支持使用 [LinkAI](https://link-ai.tech) 平台接口,可灵活切换多种模型,并支持知识库、工作流、插件等 Agent 能力。
也可以通过 [Web 控制台](/channels/web) 在线管理模型配置,无需手动编辑配置文件:
**方式一(推荐):** 通过 [Web 控制台](/channels/web) 在线管理模型配置,无需手动编辑配置文件:
<img width="850" src="https://cdn.link-ai.tech/doc/20260227173811.png" />
**方式二:** 手动编辑 `config.json`,根据所选模型填写对应的模型名称和 API Key。每个模型也支持 OpenAI 兼容方式接入,将 `bot_type` 设为 `openai`,配置 `open_ai_api_base` 和 `open_ai_api_key` 即可。
## 支持的模型
<CardGroup cols={2}>
@@ -29,7 +30,7 @@ CowAgent 支持国内外主流厂商的大语言模型,模型接口实现在
glm-5-turbo、glm-5 等系列模型
</Card>
<Card title="通义千问 Qwen" href="/models/qwen">
qwen3.5-plus、qwen3-max 等
qwen3.6-plus、qwen3-max 等
</Card>
<Card title="Kimi" href="/models/kimi">
kimi-k2.5、kimi-k2 等
@@ -54,6 +55,7 @@ CowAgent 支持国内外主流厂商的大语言模型,模型接口实现在
</Card>
</CardGroup>
<Tip>
全部模型名称可参考项目 [`common/const.py`](https://github.com/zhayujie/chatgpt-on-wechat/blob/master/common/const.py) 文件。
</Tip>

View File

@@ -5,14 +5,14 @@ description: 通义千问模型配置
```json
{
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"dashscope_api_key": "YOUR_API_KEY"
}
```
| 参数 | 说明 |
| --- | --- |
| `model` | 可填 `qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus` 等 |
| `model` | 可填 `qwen3.6-plus`、`qwen3.5-plus`、`qwen3-max`、`qwen-max`、`qwen-plus`、`qwen-turbo`、`qwq-plus` 等 |
| `dashscope_api_key` | 在 [百炼控制台](https://bailian.console.aliyun.com/?tab=model#/api-key) 创建,参考 [官方文档](https://bailian.console.aliyun.com/?tab=api#/api) |
也支持 OpenAI 兼容方式接入:
@@ -20,7 +20,7 @@ description: 通义千问模型配置
```json
{
"bot_type": "openai",
"model": "qwen3.5-plus",
"model": "qwen3.6-plus",
"open_ai_api_base": "https://dashscope.aliyuncs.com/compatible-mode/v1",
"open_ai_api_key": "YOUR_API_KEY"
}

View File

@@ -12,7 +12,7 @@ description: CowAgent 2.0.5 - Cow CLI、Skill Hub 开源、浏览器工具、企
- **web控制台**Web 控制台输入框输入 `/` 即可弹出指令菜单,支持方向键回溯历史输入
- **Windows 支持**:新增 PowerShell 一键安装脚本 `scripts/run.ps1`,同时支持 `cow` 命令
相关文档:[命令总览](https://docs.cowagent.ai/commands)
相关文档:[命令总览](https://docs.cowagent.ai/cli)
<img src="https://cdn.link-ai.tech/doc/20260401114549.png" width="750" />

View File

@@ -18,7 +18,7 @@ CowAgent 提供多种方式获取技能:
- **URL** — 从 zip 压缩包或 SKILL.md 链接安装
- **对话创建** — 通过自然语言对话让 Agent 自动创建技能
详细安装方式参考 [安装技能](/skills/install) 和 [技能管理命令](/commands/skill)。也可以通过对话 [创建技能](/skills/create),或向 [Skill Hub](https://skills.cowagent.ai/submit) 贡献你的技能。
详细安装方式参考 [安装技能](/skills/install) 和 [技能管理命令](/cli/skill)。也可以通过对话 [创建技能](/skills/create),或向 [Skill Hub](https://skills.cowagent.ai/submit) 贡献你的技能。
## 技能加载优先级

View File

@@ -62,5 +62,5 @@ CowAgent 支持通过统一的 `install` 命令安装来自 **[Cow 技能广场]
```
<Tip>
以上所有命令在终端中使用时,将 `/skill` 替换为 `cow skill` 即可。完整命令说明参考 [技能管理命令](/commands/skill)。
以上所有命令在终端中使用时,将 `/skill` 替换为 `cow skill` 即可。完整命令说明参考 [技能管理命令](/cli/skill)。
</Tip>

View File

@@ -5,14 +5,49 @@ description: 分析图片内容识别、描述、OCR 等)
使用 Vision API 分析本地图片或图片 URL支持内容描述、文字提取OCR、物体识别等。
## 依赖
## 模型选择
需要配置至少一个 API Key通过 `env_config` 工具或工作空间 `.env` 文件配置)
Vision 工具采用多级自动选择 + 自动兜底策略,无需手动配置即可使用
| 后端 | 环境变量 | 优先级 |
1. **主模型** — 优先使用当前配置的主模型进行图像识别(需要是多模态模型)
2. **其他已配置模型** — 自动发现已配置 API Key 的其他多模态模型作为备选
如果当前 provider 调用失败,会自动尝试下一个,直到成功或全部失败。
### 支持的模型
| 厂商 | 视觉模型 | 说明 |
| --- | --- | --- |
| OpenAI | `OPENAI_API_KEY` | 优先使用 |
| LinkAI | `LINKAI_API_KEY` | 备选 |
| OpenAI / 兼容协议 | 使用主模型 | 支持所有 OpenAI 协议兼容的多模态模型 |
| 通义千问 (DashScope) | 使用主模型 | 例如 qwen3.6-plus 等 |
| Claude | 使用主模型 | Anthropic 原生图像格式 |
| Gemini | 使用主模型 | inlineData 格式 |
| 豆包 (Doubao) | 使用主模型 | doubao-seed-2-0 系列原生支持 |
| Kimi (Moonshot) | 使用主模型 | kimi-k2.5 原生支持 |
| 智谱 AI | glm-5v-turbo | 固定使用视觉专用模型 |
| MiniMax | MiniMax-Text-01 | 固定使用视觉专用模型 |
<Note>
智谱和 MiniMax 的文本模型不支持图像理解,因此始终使用对应的视觉专用模型,无需手动指定。
</Note>
> 当 `use_linkai=true` 时,默认使用 LinkAI 的多模态模型进行
## 自定义配置
如果希望指定 Vision 使用的模型,可在 `config.json` 中配置,例如:
```json
{
"tool": {
"vision": {
"model": "gpt-4o"
}
}
}
```
大多数情况下无需配置,主模型支持多模态或配置任意一个支持视觉的 API Key 即可自动工作。
## 参数
@@ -20,17 +55,18 @@ description: 分析图片内容识别、描述、OCR 等)
| --- | --- | --- | --- |
| `image` | string | 是 | 本地文件路径或 HTTP(S) 图片 URL |
| `question` | string | 是 | 对图片提出的问题 |
| `model` | string | 否 | 模型名称(默认 gpt-4.1-mini |
支持的图片格式jpg、jpeg、png、gif、webp
## 使用场景
- 描述图片中的内容
- 提取图片中的文字OCR
- 识别物体、颜色、场景
- 分析截图、文档扫描
- 分析截图、文档扫描图片等
<Note>
超过 1MB 的图片会自动压缩后上传。如果未配置任何 Vision API Key该工具不会被加载
超过 1MB 的图片会自动压缩后上传,所有图片(包括远程 URL会统一转为 base64 传输,确保兼容所有模型后端
</Note>

View File

@@ -1,214 +0,0 @@
# encoding:utf-8
import json
import time
from typing import List, Tuple
import openai
from models.openai.openai_compat import RateLimitError, Timeout, APIError, APIConnectionError
import broadscope_bailian
from broadscope_bailian import ChatQaMessage
from models.bot import Bot
from models.ali.ali_qwen_session import AliQwenSession
from models.session_manager import SessionManager
from bridge.context import ContextType
from bridge.reply import Reply, ReplyType
from common.log import logger
from common import const
from config import conf, load_config
class AliQwenBot(Bot):
def __init__(self):
super().__init__()
self.api_key_expired_time = self.set_api_key()
self.sessions = SessionManager(AliQwenSession, model=conf().get("model", const.QWEN))
def api_key_client(self):
return broadscope_bailian.AccessTokenClient(access_key_id=self.access_key_id(), access_key_secret=self.access_key_secret())
def access_key_id(self):
return conf().get("qwen_access_key_id")
def access_key_secret(self):
return conf().get("qwen_access_key_secret")
def agent_key(self):
return conf().get("qwen_agent_key")
def app_id(self):
return conf().get("qwen_app_id")
def node_id(self):
return conf().get("qwen_node_id", "")
def temperature(self):
return conf().get("temperature", 0.2 )
def top_p(self):
return conf().get("top_p", 1)
def reply(self, query, context=None):
# acquire reply content
if context.type == ContextType.TEXT:
logger.info("[QWEN] query={}".format(query))
session_id = context["session_id"]
reply = None
clear_memory_commands = conf().get("clear_memory_commands", ["#清除记忆"])
if query in clear_memory_commands:
self.sessions.clear_session(session_id)
reply = Reply(ReplyType.INFO, "记忆已清除")
elif query == "#清除所有":
self.sessions.clear_all_session()
reply = Reply(ReplyType.INFO, "所有人记忆已清除")
elif query == "#更新配置":
load_config()
reply = Reply(ReplyType.INFO, "配置已更新")
if reply:
return reply
session = self.sessions.session_query(query, session_id)
logger.debug("[QWEN] session query={}".format(session.messages))
reply_content = self.reply_text(session)
logger.debug(
"[QWEN] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
session.messages,
session_id,
reply_content["content"],
reply_content["completion_tokens"],
)
)
if reply_content["completion_tokens"] == 0 and len(reply_content["content"]) > 0:
reply = Reply(ReplyType.ERROR, reply_content["content"])
elif reply_content["completion_tokens"] > 0:
self.sessions.session_reply(reply_content["content"], session_id, reply_content["total_tokens"])
reply = Reply(ReplyType.TEXT, reply_content["content"])
else:
reply = Reply(ReplyType.ERROR, reply_content["content"])
logger.debug("[QWEN] reply {} used 0 tokens.".format(reply_content))
return reply
else:
reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
return reply
def reply_text(self, session: AliQwenSession, retry_count=0) -> dict:
"""
call bailian's ChatCompletion to get the answer
:param session: a conversation session
:param retry_count: retry count
:return: {}
"""
try:
prompt, history = self.convert_messages_format(session.messages)
self.update_api_key_if_expired()
# NOTE 阿里百炼的call()函数未提供temperature参数考虑到temperature和top_p参数作用相同取两者较小的值作为top_p参数传入详情见文档 https://help.aliyun.com/document_detail/2587502.htm
response = broadscope_bailian.Completions().call(app_id=self.app_id(), prompt=prompt, history=history, top_p=min(self.temperature(), self.top_p()))
completion_content = self.get_completion_content(response, self.node_id())
completion_tokens, total_tokens = self.calc_tokens(session.messages, completion_content)
return {
"total_tokens": total_tokens,
"completion_tokens": completion_tokens,
"content": completion_content,
}
except Exception as e:
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if isinstance(e, RateLimitError):
logger.warn("[QWEN] RateLimitError: {}".format(e))
result["content"] = "提问太快啦,请休息一下再问我吧"
if need_retry:
time.sleep(20)
elif isinstance(e, Timeout):
logger.warn("[QWEN] Timeout: {}".format(e))
result["content"] = "我没有收到你的消息"
if need_retry:
time.sleep(5)
elif isinstance(e, APIError):
logger.warn("[QWEN] Bad Gateway: {}".format(e))
result["content"] = "请再问我一次"
if need_retry:
time.sleep(10)
elif isinstance(e, APIConnectionError):
logger.warn("[QWEN] APIConnectionError: {}".format(e))
need_retry = False
result["content"] = "我连接不到你的网络"
else:
logger.exception("[QWEN] Exception: {}".format(e))
need_retry = False
self.sessions.clear_session(session.session_id)
if need_retry:
logger.warn("[QWEN] 第{}次重试".format(retry_count + 1))
return self.reply_text(session, retry_count + 1)
else:
return result
def set_api_key(self):
api_key, expired_time = self.api_key_client().create_token(agent_key=self.agent_key())
broadscope_bailian.api_key = api_key
return expired_time
def update_api_key_if_expired(self):
if time.time() > self.api_key_expired_time:
self.api_key_expired_time = self.set_api_key()
def convert_messages_format(self, messages) -> Tuple[str, List[ChatQaMessage]]:
history = []
user_content = ''
assistant_content = ''
system_content = ''
for message in messages:
role = message.get('role')
if role == 'user':
user_content += message.get('content')
elif role == 'assistant':
assistant_content = message.get('content')
history.append(ChatQaMessage(user_content, assistant_content))
user_content = ''
assistant_content = ''
elif role =='system':
system_content += message.get('content')
if user_content == '':
raise Exception('no user message')
if system_content != '':
# NOTE 模拟系统消息,测试发现人格描述以"你需要扮演ChatGPT"开头能够起作用,而以"你是ChatGPT"开头模型会直接否认
system_qa = ChatQaMessage(system_content, '好的,我会严格按照你的设定回答问题')
history.insert(0, system_qa)
logger.debug("[QWEN] converted qa messages: {}".format([item.to_dict() for item in history]))
logger.debug("[QWEN] user content as prompt: {}".format(user_content))
return user_content, history
def get_completion_content(self, response, node_id):
if not response['Success']:
return f"[ERROR]\n{response['Code']}:{response['Message']}"
text = response['Data']['Text']
if node_id == '':
return text
# TODO: 当使用流程编排创建大模型应用时,响应结构如下,最终结果在['finalResult'][node_id]['response']['text']中,暂时先这么写
# {
# 'Success': True,
# 'Code': None,
# 'Message': None,
# 'Data': {
# 'ResponseId': '9822f38dbacf4c9b8daf5ca03a2daf15',
# 'SessionId': 'session_id',
# 'Text': '{"finalResult":{"LLM_T7islK":{"params":{"modelId":"qwen-plus-v1","prompt":"${systemVars.query}${bizVars.Text}"},"response":{"text":"作为一个AI语言模型我没有年龄因为我没有生日。\n我只是一个程序没有生命和身体。"}}}}',
# 'Thoughts': [],
# 'Debug': {},
# 'DocReferences': []
# },
# 'RequestId': '8e11d31551ce4c3f83f49e6e0dd998b0',
# 'Failed': None
# }
text_dict = json.loads(text)
completion_content = text_dict['finalResult'][node_id]['response']['text']
return completion_content
def calc_tokens(self, messages, completion_content):
completion_tokens = len(completion_content)
prompt_tokens = 0
for message in messages:
prompt_tokens += len(message["content"])
return completion_tokens, prompt_tokens + completion_tokens

View File

@@ -1,62 +0,0 @@
from models.session_manager import Session
from common.log import logger
"""
e.g.
[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Who won the world series in 2020?"},
{"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
{"role": "user", "content": "Where was it played?"}
]
"""
class AliQwenSession(Session):
def __init__(self, session_id, system_prompt=None, model="qianwen"):
super().__init__(session_id, system_prompt)
self.model = model
self.reset()
def discard_exceeding(self, max_tokens, cur_tokens=None):
precise = True
try:
cur_tokens = self.calc_tokens()
except Exception as e:
precise = False
if cur_tokens is None:
raise e
logger.debug("Exception when counting tokens precisely for query: {}".format(e))
while cur_tokens > max_tokens:
if len(self.messages) > 2:
self.messages.pop(1)
elif len(self.messages) == 2 and self.messages[1]["role"] == "assistant":
self.messages.pop(1)
if precise:
cur_tokens = self.calc_tokens()
else:
cur_tokens = cur_tokens - max_tokens
break
elif len(self.messages) == 2 and self.messages[1]["role"] == "user":
logger.warn("user message exceed max_tokens. total_tokens={}".format(cur_tokens))
break
else:
logger.debug("max_tokens={}, total_tokens={}, len(messages)={}".format(max_tokens, cur_tokens, len(self.messages)))
break
if precise:
cur_tokens = self.calc_tokens()
else:
cur_tokens = cur_tokens - max_tokens
return cur_tokens
def calc_tokens(self):
return num_tokens_from_messages(self.messages, self.model)
def num_tokens_from_messages(messages, model):
"""Returns the number of tokens used by a list of messages."""
# 官方token计算规则"对于中文文本来说1个token通常对应一个汉字对于英文文本来说1个token通常对应3至4个字母或1个单词"
# 详情请产看文档https://help.aliyun.com/document_detail/2586397.html
# 目前根据字符串长度粗略估计token数不影响正常使用
tokens = 0
for msg in messages:
tokens += len(msg["content"])
return tokens

View File

@@ -2,12 +2,27 @@
Auto-replay chat robot abstract class
"""
from bridge.context import Context
from bridge.reply import Reply
class Bot(object):
"""
Base class for all chat-bot implementations.
Subclasses may also implement:
call_with_tools(messages, tools=None, stream=False, **kwargs)
-> dict | generator (OpenAI-compatible format)
call_vision(image_url, question, model=None, max_tokens=1000)
-> dict with keys: model, content, usage (or error/message)
These are NOT defined here to avoid shadowing concrete implementations
provided by mixin classes (e.g. OpenAICompatibleBot) in the MRO.
Use ``hasattr(bot, 'call_vision')`` to detect support at runtime.
"""
def reply(self, query, context: Context = None) -> Reply:
"""
bot auto-reply content

View File

@@ -46,10 +46,7 @@ def create_bot(bot_type):
elif bot_type == const.CLAUDEAPI:
from models.claudeapi.claude_api_bot import ClaudeAPIBot
return ClaudeAPIBot()
elif bot_type == const.QWEN:
from models.ali.ali_qwen_bot import AliQwenBot
return AliQwenBot()
elif bot_type == const.QWEN_DASHSCOPE:
elif bot_type in (const.QWEN, const.QWEN_DASHSCOPE):
from models.dashscope.dashscope_bot import DashscopeBot
return DashscopeBot()
elif bot_type == const.GEMINI:

View File

@@ -1,7 +1,10 @@
# encoding:utf-8
import base64
import json
import re
import time
from typing import Optional
import requests
@@ -224,6 +227,79 @@ class ClaudeAPIBot(Bot, OpenAIImage):
return 64000
return 8192
@staticmethod
def _parse_data_url(data_url: str):
"""Parse a data:<mime>;base64,<data> URL into (media_type, base64_data)."""
m = re.match(r"^data:([^;]+);base64,(.+)$", data_url, re.DOTALL)
if m:
return m.group(1), m.group(2)
return None, None
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using Claude Messages API (native image blocks)."""
try:
actual_model = model or self._model_mapping(conf().get("model"))
# Build Claude-native image content block
if image_url.startswith("data:"):
media_type, b64_data = self._parse_data_url(image_url)
if not b64_data:
return {"error": True, "message": "Invalid base64 data URL"}
image_block = {
"type": "image",
"source": {"type": "base64",
"media_type": media_type or "image/jpeg",
"data": b64_data},
}
else:
image_block = {
"type": "image",
"source": {"type": "url", "url": image_url},
}
data = {
"model": actual_model,
"max_tokens": max_tokens,
"messages": [{
"role": "user",
"content": [
image_block,
{"type": "text", "text": question},
],
}],
}
headers = {
"x-api-key": self.api_key,
"anthropic-version": "2023-06-01",
"content-type": "application/json",
}
proxies = {"http": self.proxy, "https": self.proxy} if self.proxy else None
resp = requests.post(f"{self.api_base}/messages",
headers=headers, json=data, proxies=proxies)
if resp.status_code != 200:
return {"error": True, "message": f"HTTP {resp.status_code}: {resp.text[:300]}"}
body = resp.json()
text_parts = [b.get("text", "") for b in body.get("content", [])
if b.get("type") == "text"]
usage = body.get("usage", {})
return {
"model": actual_model,
"content": "".join(text_parts),
"usage": {
"prompt_tokens": usage.get("input_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("input_tokens", 0) + usage.get("output_tokens", 0),
},
}
except Exception as e:
logger.error(f"[CLAUDE] call_vision error: {e}")
return {"error": True, "message": str(e)}
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call Claude API with tool support for agent integration

View File

@@ -1,6 +1,8 @@
# encoding:utf-8
import json
from typing import Optional
from models.bot import Bot
from models.session_manager import SessionManager
from bridge.context import ContextType
@@ -26,15 +28,15 @@ dashscope_models = {
# Model name prefixes that require MultiModalConversation API instead of Generation API.
# Qwen3.5+ series are omni models that only support MultiModalConversation.
MULTIMODAL_MODEL_PREFIXES = ("qwen3.5-",)
MULTIMODAL_MODEL_PREFIXES = ("qwen3.5-", "qwen3.6-")
# Qwen对话模型API
class DashscopeBot(Bot):
def __init__(self):
super().__init__()
self.sessions = SessionManager(DashscopeSession, model=conf().get("model") or "qwen-plus")
self.model_name = conf().get("model") or "qwen-plus"
self.sessions = SessionManager(DashscopeSession, model=conf().get("model") or "qwen3.6-plus")
self.model_name = conf().get("model") or "qwen3.6-plus"
self.client = dashscope.Generation
api_key = conf().get("dashscope_api_key")
if api_key:
@@ -153,6 +155,56 @@ class DashscopeBot(Bot):
else:
return result
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using DashScope MultiModalConversation API."""
try:
dashscope.api_key = self.api_key
vision_model = model or "qwen-vl-max"
# DashScope multimodal format: {"image": url} + {"text": question}
messages = [{
"role": "user",
"content": [
{"image": image_url},
{"text": question},
],
}]
response = MultiModalConversation.call(
model=vision_model,
messages=messages,
max_tokens=max_tokens,
)
if response.status_code != HTTPStatus.OK:
return {
"error": True,
"message": f"{response.code} - {response.message}",
}
resp_dict = self._response_to_dict(response)
choice = resp_dict["output"]["choices"][0]
content = choice.get("message", {}).get("content", "")
if isinstance(content, list):
content = "".join(
item.get("text", "") for item in content if isinstance(item, dict)
)
usage = resp_dict.get("usage", {})
return {
"model": vision_model,
"content": content,
"usage": {
"prompt_tokens": usage.get("input_tokens", 0),
"completion_tokens": usage.get("output_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
except Exception as e:
logger.error(f"[DASHSCOPE] call_vision error: {e}")
return {"error": True, "message": str(e)}
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call DashScope API with tool support for agent integration

View File

@@ -2,6 +2,7 @@
import json
import time
from typing import Optional
import requests
from models.bot import Bot
@@ -147,6 +148,49 @@ class DoubaoBot(Bot):
else:
return result
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using Doubao (Volcengine Ark) OpenAI-compatible API."""
try:
vision_model = model or self.args.get("model", "doubao-seed-2-0-pro-260215")
payload = {
"model": vision_model,
"max_tokens": max_tokens,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": question},
{"type": "image_url", "image_url": {"url": image_url}},
],
}],
}
headers = {
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json",
}
resp = requests.post(f"{self.base_url}/chat/completions",
headers=headers, json=payload, timeout=60)
if resp.status_code != 200:
return {"error": True, "message": f"HTTP {resp.status_code}: {resp.text[:300]}"}
data = resp.json()
if "error" in data:
return {"error": True, "message": data["error"].get("message", str(data["error"]))}
content = data.get("choices", [{}])[0].get("message", {}).get("content", "")
usage = data.get("usage", {})
return {
"model": vision_model,
"content": content,
"usage": {
"prompt_tokens": usage.get("prompt_tokens", 0),
"completion_tokens": usage.get("completion_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
except Exception as e:
logger.error(f"[DOUBAO] call_vision error: {e}")
return {"error": True, "message": str(e)}
# ==================== Agent mode support ====================
def call_with_tools(self, messages, tools=None, stream: bool = False, **kwargs):
@@ -434,31 +478,37 @@ class DoubaoBot(Bot):
continue
if role == "user":
text_parts = []
tool_results = []
has_tool_result = any(
isinstance(b, dict) and b.get("type") == "tool_result" for b in content
)
if has_tool_result:
text_parts = []
tool_results = []
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
tool_call_id = block.get("tool_use_id") or ""
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
tool_call_id = block.get("tool_use_id") or ""
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
# Tool results first (must come right after assistant with tool_calls)
for tr in tool_results:
converted.append(tr)
for tr in tool_results:
converted.append(tr)
if text_parts:
converted.append({"role": "user", "content": "\n".join(text_parts)})
if text_parts:
converted.append({"role": "user", "content": "\n".join(text_parts)})
else:
# Keep as-is for multimodal content (e.g. image_url blocks)
converted.append(msg)
elif role == "assistant":
openai_msg = {"role": "assistant"}

View File

@@ -12,6 +12,8 @@ import mimetypes
import os
import re
import time
from typing import Optional
import requests
from models.bot import Bot
from models.session_manager import SessionManager
@@ -144,7 +146,12 @@ class GoogleGeminiBot(Bot):
return "", []
pattern = r"\[图片:\s*([^\]]+)\]"
image_paths = [m.strip().strip("'\"") for m in re.findall(pattern, content) if m.strip()]
cleaned_text = re.sub(pattern, "", content)
# Replace markers with path-only hints so the model still knows the
# original file location (needed when it calls tools like vision).
def _replace_with_hint(m):
path = m.group(1).strip().strip("'\"")
return f"[attached image: {path}]"
cleaned_text = re.sub(pattern, _replace_with_hint, content)
cleaned_text = re.sub(r"\n{3,}", "\n\n", cleaned_text).strip()
return cleaned_text, image_paths
@@ -225,6 +232,57 @@ class GoogleGeminiBot(Bot):
logger.warning(f"[Gemini] Unsupported image URL format: {image_url[:120]}")
return None
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using Gemini REST API."""
try:
model_name = model or self.model or "gemini-2.0-flash"
image_part = self._build_inline_part_from_image_url({"url": image_url})
if not image_part:
return {"error": True, "message": f"Cannot process image URL: {image_url[:120]}"}
payload = {
"contents": [{
"role": "user",
"parts": [image_part, {"text": question}],
}],
"generationConfig": {"maxOutputTokens": max_tokens},
"safetySettings": [
{"category": "HARM_CATEGORY_HATE_SPEECH", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_NONE"},
{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
],
}
endpoint = f"{self.api_base}/v1beta/models/{model_name}:generateContent"
headers = {"x-goog-api-key": self.api_key, "Content-Type": "application/json"}
resp = requests.post(endpoint, headers=headers, json=payload, timeout=60)
if resp.status_code != 200:
return {"error": True, "message": f"HTTP {resp.status_code}: {resp.text[:300]}"}
body = resp.json()
candidates = body.get("candidates", [])
text_parts = []
for part in candidates[0].get("content", {}).get("parts", []) if candidates else []:
if "text" in part:
text_parts.append(part["text"])
usage_meta = body.get("usageMetadata", {})
return {
"model": model_name,
"content": "".join(text_parts),
"usage": {
"prompt_tokens": usage_meta.get("promptTokenCount", 0),
"completion_tokens": usage_meta.get("candidatesTokenCount", 0),
"total_tokens": usage_meta.get("totalTokenCount", 0),
},
}
except Exception as e:
logger.error(f"[Gemini] call_vision error: {e}")
return {"error": True, "message": str(e)}
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call Gemini API with tool support using REST API (following official docs)

View File

@@ -2,6 +2,8 @@
import time
import json
from typing import Optional
import requests
from models.bot import Bot
@@ -175,6 +177,51 @@ class MinimaxBot(Bot):
else:
return result
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using MiniMax OpenAI-compatible API.
Always uses MiniMax-Text-01 — other MiniMax models do not support vision.
"""
try:
vision_model = "MiniMax-Text-01"
payload = {
"model": vision_model,
"max_tokens": max_tokens,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": question},
{"type": "image_url", "image_url": {"url": image_url}},
],
}],
}
headers = {
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json",
}
resp = requests.post(f"{self.api_base}/chat/completions",
headers=headers, json=payload, timeout=60)
if resp.status_code != 200:
return {"error": True, "message": f"HTTP {resp.status_code}: {resp.text[:300]}"}
data = resp.json()
if "error" in data:
return {"error": True, "message": data["error"].get("message", str(data["error"]))}
content = data.get("choices", [{}])[0].get("message", {}).get("content", "")
usage = data.get("usage", {})
return {
"model": vision_model,
"content": content,
"usage": {
"prompt_tokens": usage.get("prompt_tokens", 0),
"completion_tokens": usage.get("completion_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
except Exception as e:
logger.error(f"[MINIMAX] call_vision error: {e}")
return {"error": True, "message": str(e)}
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call MiniMax API with tool support for agent integration
@@ -273,37 +320,41 @@ class MinimaxBot(Bot):
if role == "user":
# Handle user message
if isinstance(content, list):
# Extract text from content blocks
text_parts = []
tool_results = []
has_tool_result = any(
isinstance(b, dict) and b.get("type") == "tool_result" for b in content
)
if has_tool_result:
text_parts = []
tool_results = []
for block in content:
if isinstance(block, dict):
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
# Tool result should be a separate message with role="tool"
tool_call_id = block.get("tool_use_id") or ""
if not tool_call_id:
logger.warning(f"[MINIMAX] tool_result missing tool_use_id")
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
for block in content:
if isinstance(block, dict):
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
tool_call_id = block.get("tool_use_id") or ""
if not tool_call_id:
logger.warning(f"[MINIMAX] tool_result missing tool_use_id")
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
if text_parts:
converted.append({
"role": "user",
"content": "\n".join(text_parts)
})
if text_parts:
converted.append({
"role": "user",
"content": "\n".join(text_parts)
})
# Add all tool results (not just the last one)
for tool_result in tool_results:
converted.append(tool_result)
for tool_result in tool_results:
converted.append(tool_result)
else:
# Keep as-is for multimodal content (e.g. image_url blocks)
converted.append(msg)
else:
# Simple text content
converted.append({

View File

@@ -2,6 +2,7 @@
import json
import time
from typing import Optional
import requests
from models.bot import Bot
@@ -147,6 +148,49 @@ class MoonshotBot(Bot):
else:
return result
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using Moonshot (Kimi) OpenAI-compatible API."""
try:
vision_model = model or self.args.get("model", "kimi-k2.5")
payload = {
"model": vision_model,
"max_tokens": max_tokens,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": question},
{"type": "image_url", "image_url": {"url": image_url}},
],
}],
}
headers = {
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json",
}
resp = requests.post(f"{self.base_url}/chat/completions",
headers=headers, json=payload, timeout=60)
if resp.status_code != 200:
return {"error": True, "message": f"HTTP {resp.status_code}: {resp.text[:300]}"}
data = resp.json()
if "error" in data:
return {"error": True, "message": data["error"].get("message", str(data["error"]))}
content = data.get("choices", [{}])[0].get("message", {}).get("content", "")
usage = data.get("usage", {})
return {
"model": vision_model,
"content": content,
"usage": {
"prompt_tokens": usage.get("prompt_tokens", 0),
"completion_tokens": usage.get("completion_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
except Exception as e:
logger.error(f"[MOONSHOT] call_vision error: {e}")
return {"error": True, "message": str(e)}
# ==================== Agent mode support ====================
def call_with_tools(self, messages, tools=None, stream: bool = False, **kwargs):
@@ -435,31 +479,37 @@ class MoonshotBot(Bot):
continue
if role == "user":
text_parts = []
tool_results = []
has_tool_result = any(
isinstance(b, dict) and b.get("type") == "tool_result" for b in content
)
if has_tool_result:
text_parts = []
tool_results = []
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
tool_call_id = block.get("tool_use_id") or ""
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
for block in content:
if not isinstance(block, dict):
continue
if block.get("type") == "text":
text_parts.append(block.get("text", ""))
elif block.get("type") == "tool_result":
tool_call_id = block.get("tool_use_id") or ""
result_content = block.get("content", "")
if not isinstance(result_content, str):
result_content = json.dumps(result_content, ensure_ascii=False)
tool_results.append({
"role": "tool",
"tool_call_id": tool_call_id,
"content": result_content
})
# Tool results first (must come right after assistant with tool_calls)
for tr in tool_results:
converted.append(tr)
for tr in tool_results:
converted.append(tr)
if text_parts:
converted.append({"role": "user", "content": "\n".join(text_parts)})
if text_parts:
converted.append({"role": "user", "content": "\n".join(text_parts)})
else:
# Keep as-is for multimodal content (e.g. image_url blocks)
converted.append(msg)
elif role == "assistant":
openai_msg = {"role": "assistant"}

View File

@@ -9,6 +9,8 @@ This includes: OpenAI, LinkAI, Azure OpenAI, and many third-party providers.
import json
import openai
import requests
from typing import Optional
from common.log import logger
from agent.protocol.message_utils import drop_orphaned_tool_results_openai
@@ -306,3 +308,51 @@ class OpenAICompatibleBot:
openai_messages.append(msg)
return drop_orphaned_tool_results_openai(openai_messages)
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using the OpenAI-compatible /chat/completions endpoint."""
try:
api_config = self.get_api_config()
vision_model = model or api_config.get("model", "gpt-4o")
api_key = api_config.get("api_key", "")
api_base = (api_config.get("api_base") or "https://api.openai.com/v1").rstrip("/")
payload = {
"model": vision_model,
"messages": [{
"role": "user",
"content": [
{"type": "text", "text": question},
{"type": "image_url", "image_url": {"url": image_url}},
],
}],
}
headers = {
"Authorization": f"Bearer {api_key}",
"Content-Type": "application/json",
}
resp = requests.post(
f"{api_base}/chat/completions",
headers=headers, json=payload, timeout=60,
)
if resp.status_code != 200:
body = resp.text[:500]
logger.error(f"[{self.__class__.__name__}] call_vision HTTP {resp.status_code}: {body}")
return {"error": True, "message": f"HTTP {resp.status_code}: {body}"}
data = resp.json()
content = data.get("choices", [{}])[0].get("message", {}).get("content", "")
usage = data.get("usage", {})
return {
"model": vision_model,
"content": content,
"usage": {
"prompt_tokens": usage.get("prompt_tokens", 0),
"completion_tokens": usage.get("completion_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
except Exception as e:
logger.error(f"[{self.__class__.__name__}] call_vision error: {e}")
return {"error": True, "message": str(e)}

View File

@@ -2,6 +2,7 @@
import time
import json
from typing import Optional
from models.bot import Bot
from models.zhipuai.zhipu_ai_session import ZhipuAISession
@@ -149,6 +150,40 @@ class ZHIPUAIBot(Bot, ZhipuAIImage):
else:
return result
def call_vision(self, image_url: str, question: str,
model: Optional[str] = None,
max_tokens: int = 1000) -> dict:
"""Analyze an image using ZhipuAI OpenAI-compatible SDK.
Always uses glm-5v-turbo — the text models (glm-5-turbo etc.) do not support vision.
"""
try:
vision_model = "glm-5v-turbo"
response = self.client.chat.completions.create(
model=vision_model,
max_tokens=max_tokens,
messages=[{
"role": "user",
"content": [
{"type": "text", "text": question},
{"type": "image_url", "image_url": {"url": image_url}},
],
}],
)
content = response.choices[0].message.content or ""
usage = response.usage
return {
"model": vision_model,
"content": content,
"usage": {
"prompt_tokens": getattr(usage, "prompt_tokens", 0),
"completion_tokens": getattr(usage, "completion_tokens", 0),
"total_tokens": getattr(usage, "total_tokens", 0),
},
}
except Exception as e:
logger.error(f"[ZHIPU_AI] call_vision error: {e}")
return {"error": True, "message": str(e)}
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call ZhipuAI API with tool support for agent integration

View File

@@ -157,7 +157,6 @@ class CowCliPlugin(Plugin):
" /config 查看当前配置",
" /config <key> 查看某项配置",
" /config <key> <val> 修改配置",
" /install-browser 安装浏览器工具依赖",
"",
"💡 也可以用 cow <command> 代替 /<command>",
]
@@ -407,7 +406,7 @@ class CowCliPlugin(Plugin):
from common import const
_EXACT = {
"wenxin": const.BAIDU, "wenxin-4": const.BAIDU,
"xunfei": const.XUNFEI, const.QWEN: const.QWEN,
"xunfei": const.XUNFEI, const.QWEN: const.QWEN_DASHSCOPE,
const.MODELSCOPE: const.MODELSCOPE,
const.MOONSHOT: const.MOONSHOT,
"moonshot-v1-8k": const.MOONSHOT, "moonshot-v1-32k": const.MOONSHOT,

View File

@@ -315,7 +315,7 @@ class Godcmd(Plugin):
except Exception as e:
ok, result = False, "你没有设置私有GPT模型"
elif cmd == "reset":
if bottype in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.LINKAI, const.BAIDU, const.XUNFEI, const.QWEN, const.GEMINI, const.ZHIPU_AI, const.CLAUDEAPI]:
if bottype in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.LINKAI, const.BAIDU, const.XUNFEI, const.QWEN, const.QWEN_DASHSCOPE, const.GEMINI, const.ZHIPU_AI, const.CLAUDEAPI]:
bot.sessions.clear_session(session_id)
if Bridge().chat_bots.get(bottype):
Bridge().chat_bots.get(bottype).sessions.clear_session(session_id)
@@ -341,7 +341,7 @@ class Godcmd(Plugin):
ok, result = True, "配置已重载"
elif cmd == "resetall":
if bottype in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.LINKAI,
const.BAIDU, const.XUNFEI, const.QWEN, const.GEMINI, const.ZHIPU_AI, const.MOONSHOT,
const.BAIDU, const.XUNFEI, const.QWEN, const.QWEN_DASHSCOPE, const.GEMINI, const.ZHIPU_AI, const.MOONSHOT,
const.MODELSCOPE]:
channel.cancel_all_session()
bot.sessions.clear_all_session()

View File

@@ -6,7 +6,7 @@ build-backend = "setuptools.build_meta"
name = "cowagent"
version = "1.0.0"
description = "CowAgent - AI Agent on WeChat and more"
requires-python = ">=3.9"
requires-python = ">=3.7"
dependencies = [
"click>=8.0",
"requests>=2.28.2",

View File

@@ -4,8 +4,6 @@ requests>=2.28.2
chardet>=5.1.0
Pillow
web.py
linkai>=0.0.6.0
agentmesh-sdk>=0.1.3
python-dotenv>=1.0.0
PyYAML>=6.0
croniter>=2.0.0

4
run.sh
View File

@@ -271,7 +271,7 @@ select_model() {
echo -e "${YELLOW}2) Zhipu AI (glm-5-turbo, glm-5, etc.)${NC}"
echo -e "${YELLOW}3) Kimi (kimi-k2.5, kimi-k2, etc.)${NC}"
echo -e "${YELLOW}4) Doubao (doubao-seed-2-0-code-preview-260215, etc.)${NC}"
echo -e "${YELLOW}5) Qwen (qwen3.5-plus, qwen3-max, qwq-plus, etc.)${NC}"
echo -e "${YELLOW}5) Qwen (qwen3.6-plus, qwen3.5-plus, qwen3-max, qwq-plus, etc.)${NC}"
echo -e "${YELLOW}6) Claude (claude-sonnet-4-6, claude-opus-4-6, etc.)${NC}"
echo -e "${YELLOW}7) Gemini (gemini-3.1-flash-lite-preview, gemini-3.1-pro-preview, etc.)${NC}"
echo -e "${YELLOW}8) OpenAI GPT (gpt-5.4, gpt-5.2, gpt-4.1, etc.)${NC}"
@@ -318,7 +318,7 @@ configure_model() {
2) read_model_config "Zhipu AI" "glm-5-turbo" "ZHIPU_KEY" ;;
3) read_model_config "Kimi (Moonshot)" "kimi-k2.5" "MOONSHOT_KEY" ;;
4) read_model_config "Doubao (Volcengine Ark)" "doubao-seed-2-0-code-preview-260215" "ARK_KEY" ;;
5) read_model_config "Qwen (DashScope)" "qwen3.5-plus" "DASHSCOPE_KEY" ;;
5) read_model_config "Qwen (DashScope)" "qwen3.6-plus" "DASHSCOPE_KEY" ;;
6)
read_model_config "Claude" "claude-sonnet-4-6" "CLAUDE_KEY"
read_api_base "CLAUDE_BASE" "https://api.anthropic.com/v1"

View File

@@ -154,7 +154,7 @@ $ModelChoices = @{
"2" = @{ Provider = "Zhipu AI"; Default = "glm-5-turbo"; Key = "ZHIPU_KEY" }
"3" = @{ Provider = "Kimi (Moonshot)"; Default = "kimi-k2.5"; Key = "MOONSHOT_KEY" }
"4" = @{ Provider = "Doubao (Volcengine Ark)"; Default = "doubao-seed-2-0-code-preview-260215"; Key = "ARK_KEY" }
"5" = @{ Provider = "Qwen (DashScope)"; Default = "qwen3.5-plus"; Key = "DASHSCOPE_KEY" }
"5" = @{ Provider = "Qwen (DashScope)"; Default = "qwen3.6-plus"; Key = "DASHSCOPE_KEY" }
"6" = @{ Provider = "Claude"; Default = "claude-sonnet-4-6"; Key = "CLAUDE_KEY"; Base = "https://api.anthropic.com/v1" }
"7" = @{ Provider = "Gemini"; Default = "gemini-3.1-pro-preview"; Key = "GEMINI_KEY"; Base = "https://generativelanguage.googleapis.com" }
"8" = @{ Provider = "OpenAI GPT"; Default = "gpt-5.4"; Key = "OPENAI_KEY"; Base = "https://api.openai.com/v1" }
@@ -169,7 +169,7 @@ function Select-Model {
Write-Host "2) Zhipu AI (glm-5-turbo, glm-5, etc.)"
Write-Host "3) Kimi (kimi-k2.5, kimi-k2, etc.)"
Write-Host "4) Doubao (doubao-seed-2-0-code-preview-260215, etc.)"
Write-Host "5) Qwen (qwen3.5-plus, qwen3-max, qwq-plus, etc.)"
Write-Host "5) Qwen (qwen3.6-plus, qwen3.5-plus, qwen3-max, qwq-plus, etc.)"
Write-Host "6) Claude (claude-sonnet-4-6, claude-opus-4-6, etc.)"
Write-Host "7) Gemini (gemini-3.1-flash-lite-preview, gemini-3.1-pro-preview, etc.)"
Write-Host "8) OpenAI GPT (gpt-5.4, gpt-5.2, gpt-4.1, etc.)"
@@ -453,7 +453,11 @@ function Update-Project {
Assert-Python
Install-Dependencies
Start-CowAgent
# Start via python -m cli.cli instead of cow.exe, because the exe may
# still be cached/locked from the previous installation on Windows.
Write-Cow "Starting CowAgent..."
& $PythonCmd -m cli.cli start
}
# ── main ──────────────────────────────────────────────────────────