Compare commits

39 Commits

Author SHA1 Message Date
zhayujie
55aaf60a57 feat: release 2.0.8 2026-05-06 16:19:20 +08:00
zhayujie
a5790d82f6 feat(qianfan): scope vision support to multimodal models 2026-05-06 16:11:10 +08:00
zhayujie
63f99af1e6 Merge pull request #2800 from jimmyzhuu/feat/qianfan-vision-provider
Add Qianfan support to Vision tool
2026-05-06 15:39:12 +08:00
zhayujie
4eed2568aa fix(bash): reduce safety check false positives 2026-05-06 15:36:44 +08:00
jimmyzhuu
fb7962c7f2 fix: use available qianfan vision model 2026-05-06 13:34:39 +08:00
jimmyzhuu
76e6b7b471 docs: document qianfan vision support 2026-05-06 13:28:46 +08:00
jimmyzhuu
fccb7ff9ed feat: route qianfan vision provider 2026-05-06 13:25:59 +08:00
jimmyzhuu
3b12ef2e66 feat: add qianfan vision calls 2026-05-06 13:24:41 +08:00
jimmyzhuu
f9d099be1b feat: add qianfan vision model constants 2026-05-06 13:23:04 +08:00
zhayujie
c322c0e3a5 docs(models): add ernie-5.0 2026-05-06 12:15:14 +08:00
zhayujie
530fc20596 Merge pull request #2790 from jimmyzhuu/feat/qianfan-provider
Add first-class Baidu Qianfan / ERNIE provider
2026-05-06 11:43:32 +08:00
zhayujie
a23b4ed754 Merge pull request #2797 from Zmjjeff7/feat-translate-youdao
feat(translate): add Youdao as a new translation provider
2026-05-06 11:28:50 +08:00
zhayujie
fc4f5077b0 fix: update .gitignore 2026-05-06 11:27:57 +08:00
Zmjjeff7
6a553886da feat(translate): add Youdao as a new translation provider
The translate module previously only supported Baidu translation, and the
factory raised a bare RuntimeError for any other type. This change adds
Youdao Translation as a second provider and improves the factory's error
message.

Implementation details:
- New YoudaoTranslator class in translate/youdao/youdao_translate.py
- Implements Youdao's v3 SHA-256 signature scheme, including the
  truncate-input rule for queries longer than 20 characters
- Maps ISO 639-1 language codes to Youdao-specific codes
  (zh -> zh-CHS, zh-TW -> zh-CHT, others pass through)
- Differentiates network errors, API error codes, and empty translations
- factory.create_translator now lists the supported types in its
  RuntimeError message instead of failing silently
- Default config exposes youdao_translate_app_key and
  youdao_translate_app_secret

Adds 17 unit tests covering signature correctness, language code mapping,
input truncation edge cases, the full request/response flow, and factory
dispatch. All tests pass under Python 3.11.
2026-05-05 23:58:32 +08:00
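The signature and truncation rules named in this commit message can be sketched as follows. This is a minimal illustration of Youdao's publicly documented v3 scheme, not the code added by the commit; the names `truncate_input`, `youdao_sign`, and `to_youdao_lang` are hypothetical.

```python
import hashlib

# Language mapping as described in the commit message; other codes pass through.
_LANG_MAP = {"zh": "zh-CHS", "zh-TW": "zh-CHT"}


def truncate_input(q: str) -> str:
    """Apply the 'input' rule: queries longer than 20 characters become
    first 10 characters + total length + last 10 characters."""
    if len(q) <= 20:
        return q
    return q[:10] + str(len(q)) + q[-10:]


def youdao_sign(app_key: str, q: str, salt: str, curtime: str, app_secret: str) -> str:
    """sign = sha256(appKey + input + salt + curtime + appSecret), hex-encoded."""
    raw = app_key + truncate_input(q) + salt + curtime + app_secret
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def to_youdao_lang(code: str) -> str:
    """Map an ISO 639-1 style code to the Youdao-specific code."""
    return _LANG_MAP.get(code, code)
```

In a real request, `salt` would typically be a random UUID string and `curtime` the current Unix timestamp in seconds, both sent alongside the signature.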
zhayujie
1065c7e722 fix(feishu): unblock streaming via async push worker 2026-05-05 19:36:15 +08:00
zhayujie
a9c8a59f58 feat(feishu): one-click QR-scan app creation 2026-05-05 18:32:58 +08:00
zhayujie
8730f7fd27 fix(memory): exclude scheduler-injected pairs from daily memory flush 2026-05-05 16:53:01 +08:00
zhayujie
8f608223d7 perf(feishu): tune streaming render speed 2026-05-05 14:53:30 +08:00
zhayujie
a7cbd47a2f fix(feishu): default feishu_stream_reply to true 2026-05-05 14:30:34 +08:00
zhayujie
b80c3fe5a8 feat(feishu): enhance #2791 with cardkit streaming + ASR fixes
- rewrite streaming reply to official cardkit v2.0 API (default on, auto-fallback)
- fix Whisper hallucination: bump ASR sample rate to 16k, pass language=zh
- fix lock-over-IO and tmp file cleanup from #2791
- drop deprecated feishu_bot_name; quiet unknown-key warnings
- docs: cardkit permission and feishu_stream_reply usage
2026-05-05 14:15:25 +08:00
zhayujie
5080051e39 Merge pull request #2791 from ooaaooaa123/feat/feishu-voice-stream-reply
feat(feishu): support sending and receiving voice messages and streaming typewriter replies
2026-05-05 13:10:00 +08:00
zhayujie
23bfc8d0ba fix(feishu): update config-template.json 2026-05-05 13:05:39 +08:00
zhayujie
80e9062041 fix(vision): respect tool.vision.model and add automatic fallback #2792 2026-05-03 22:28:32 +08:00
zhayujie
67bd3420ed perf(scheduler): bound isolated session context to agent_max_context_turns/5 2026-05-03 21:49:59 +08:00
zhayujie
aea081703f fix(scheduler): inject delivered output into receiver session with sliding window
Further refinements on top of #2795:

- persist real session_id (notify_session_id) at task creation so group chats
  correctly map back to the user's actual conversation
- mark scheduler turns with [SCHEDULED] (recognise legacy "Scheduled task"
  prefix too for backward-compatible pruning)
- prune both DB and in-memory to scheduler_inject_max_per_session (default 3),
  only marker-tagged pairs are touched; regular user turns never deleted
- send_message type gated by scheduler_inject_send_message (default false) —
  fixed reminder text rarely benefits follow-up Q&A

Co-authored-by: huangrichao2020 <grdomai43881@gmail.com>
2026-05-03 21:27:24 +08:00
zhayujie
f300d2a2d5 Merge pull request #2795 from huangrichao2020/fix/scheduler-remember-v2
fix: remember scheduled task outputs with correct session mapping (v2)
2026-05-03 21:02:40 +08:00
tingchim2pro
f150d7d83a fix: remember scheduled task outputs in receiver session (v2)
Address review feedback from #2794:

1. Use notify_session_id instead of receiver for correct group chat mapping
   - Task creation should store the real session_id in action.notify_session_id
   - Falls back to receiver for backward compatibility with old tasks

2. Add injection to all four execution branches:
   - _execute_agent_task
   - _execute_send_message
   - _execute_tool_call
   - _execute_skill_call (also fixed missing channel.send)

3. Add config switch and content truncation:
   - scheduler_inject_to_session (default: true) to toggle the feature
   - 2000 char limit prevents high-frequency tasks from bloating sessions

Fixes #2793
2026-05-02 19:00:50 +08:00
ooaaooaa123
4d1f059c0d feat(feishu): add voice message support and streaming text reply
- Receive audio messages: map msg_type=audio to ContextType.VOICE and
  download opus file via lazy _prepare_fn for STT pipeline
- Send voice replies: upload opus audio via Feishu file API, auto-convert
  non-opus formats (e.g. mp3) using pydub before upload
- Streaming text reply: inject on_event callback into context; send a card
  placeholder on first delta, then PATCH-update it in-place at a
  configurable interval (feishu_stream_interval_ms) to achieve typewriter
  effect; set feishu_streamed=True to suppress duplicate send()
- Enable NOT_SUPPORT_REPLYTYPE=[] to unblock voice and image reply types
- Fix AudioSegment mutation bug in audio_convert.py: set_frame_rate /
  set_channels return new objects and must be reassigned
- Add -nostdin to ffmpeg invocation to prevent stdin deadlock in daemon
- Add feishu_bot_name, feishu_stream_reply, feishu_stream_interval_ms
  config keys to config-template.json
2026-04-30 16:14:57 +08:00
jimmyzhuu
bc7f953fcc docs: add qianfan provider guide 2026-04-29 16:41:25 +08:00
jimmyzhuu
f653483eea feat: expose qianfan in configuration surfaces 2026-04-29 16:32:53 +08:00
jimmyzhuu
6b200fd36b fix: handle qianfan error responses 2026-04-29 16:24:37 +08:00
jimmyzhuu
161fc6cdf0 feat: add qianfan chat bot 2026-04-29 16:19:27 +08:00
jimmyzhuu
6f68ed6bce test: restore cow cli parent module attribute 2026-04-29 16:12:08 +08:00
jimmyzhuu
a4592ffdfe test: isolate cow cli plugin import 2026-04-29 16:08:40 +08:00
jimmyzhuu
7cd7bd1a48 fix: avoid cow cli import side effects 2026-04-29 16:04:48 +08:00
jimmyzhuu
9eeca70292 feat: register qianfan model provider 2026-04-29 15:52:32 +08:00
zhayujie
02bfe30848 fix(memory): prevent duplicate Deep Dream runs 2026-04-28 15:30:51 +08:00
zhayujie
c9c99de3d9 fix(bash): scope safety confirm to destructive deletions outside workspace 2026-04-28 10:18:47 +08:00
zhayujie
8752f0cc60 refactor(openai): drop SDK dependency and switch to native HTTP client 2026-04-27 20:21:54 +08:00
60 changed files with 4976 additions and 617 deletions

View File

@@ -70,6 +70,8 @@
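# 🏷 Changelog
>**2026.05.06** [Version 2.0.8](https://github.com/zhayujie/CowAgent/releases/tag/2.0.8): comprehensive Feishu channel upgrade (voice, streaming output and Markdown, one-click QR-scan onboarding), new model support (DeepSeek V4, Baidu Qianfan), scheduler tool enhancements, and more.
>**2026.04.22** [Version 2.0.7](https://github.com/zhayujie/CowAgent/releases/tag/2.0.7): built-in image generation skill (GPT Image 2, Nano Banana, etc.), new model support (Kimi K2.6, Claude Opus 4.7, GLM 5.1), knowledge base and memory enhancements, Web console improvements.
>**2026.04.14** [Version 2.0.6](https://github.com/zhayujie/CowAgent/releases/tag/2.0.6): knowledge base system, dream memory module, intelligent context compression, multi-session Web console, and various optimizations.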
@@ -115,7 +117,7 @@ irm https://cdn.link-ai.tech/code/cow/run.ps1 | iex
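The project supports model APIs from major vendors in China and abroad. For available models and configuration details, see [Model Guide](#模型说明).
> In Agent mode the following models are recommended; choose based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, glm-5.1, kimi-k2.6, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview, gpt-5.4, gpt-5.4-mini
> In Agent mode the following models are recommended; choose based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, glm-5.1, kimi-k2.6, qwen3.5-plus, claude-sonnet-4-6, gemini-3.1-pro-preview, gpt-5.4, gpt-5.4-mini, ernie-5.0
The **LinkAI platform** API is also supported. It covers all of the models above and offers Agent skills such as knowledge bases, workflows, and plugins; see the [API documentation](https://docs.link-ai.tech/platform/api).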
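@@ -597,33 +599,35 @@ API Key creation: in the [console](https://aistudio.google.com/app/apikey?hl=zh-cn
</details>
<details>
<summary>Baidu Wenxin</summary>
Option 1: official SDK integration, configured as follows:
<summary>Baidu Qianfan / ERNIE</summary>
Option 1: official integration (recommended), configured as follows: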
```json
{
"model": "wenxin-4",
"baidu_wenxin_api_key": "IajztZ0bDxgnP9bEykU7lBer",
"baidu_wenxin_secret_key": "EDPZn6L24uAS9d8RWFfotK47dPvkjD6G"
"model": "ernie-5.0",
"qianfan_api_key": "",
"qianfan_api_base": "https://qianfan.baidubce.com/v2"
}
```
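- `model`: either `wenxin` or `wenxin-4`, corresponding to Wenxin 3.5 and Wenxin 4.0
- `baidu_wenxin_api_key`: see the [Qianfan platform access_token authentication](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/dlv4pct3s) documentation to obtain an API Key
- `baidu_wenxin_secret_key`: see the [Qianfan platform access_token authentication](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/dlv4pct3s) documentation to obtain a Secret Key
- `model`: `ernie-5.0` is the recommended default (multimodal, can analyze images directly); `ernie-x1.1`, `ernie-4.5-turbo-128k`, and `ernie-4.5-turbo-32k` are also accepted. When the main model is a text-only ERNIE model, the Vision tool automatically falls back to `ernie-4.5-turbo-vl`
- `qianfan_api_key`: Baidu Qianfan API Key, usually starting with `bce-v3/`; it can be created in the Baidu AI Cloud console
- `qianfan_api_base`: optional, defaults to `https://qianfan.baidubce.com/v2`
Option 2: OpenAI-compatible integration, configured as follows: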
```json
{
"bot_type": "openai",
"model": "ERNIE-4.0-Turbo-8K",
"model": "ernie-5.0",
"open_ai_api_base": "https://qianfan.baidubce.com/v2",
"open_ai_api_key": "bce-v3/ALTxxxxxxd2b"
"open_ai_api_key": ""
}
```
- `bot_type`: OpenAI-compatible mode
- `model`: supports all official models; see the [model list](https://cloud.baidu.com/doc/WENXINWORKSHOP/s/Wm9cvy6rl)
- `open_ai_api_base`: BASE URL of the Baidu Wenxin API
- `open_ai_api_key`: Baidu Wenxin API key; see the [official documentation](https://cloud.baidu.com/doc/qianfan-api/s/ym9chdsy5) and create an API Key in the [console](https://console.bce.baidu.com/iam/#/iam/apikey/list)
- `model`: supports the ERNIE models available on the Qianfan platform
- `open_ai_api_base`: BASE URL of the Baidu Qianfan OpenAI-compatible API
- `open_ai_api_key`: Baidu Qianfan API Key
</details>
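@@ -724,36 +728,26 @@ Coding Plan is a monthly coding subscription offered by each vendor; all vendors can be accessed via O
<details>
<summary>3. Feishu - 飞书</summary>
Feishu supports two event-receiving modes: WebSocket long connection (recommended) and Webhook
Feishu uses WebSocket long-connection mode, so no public IP is required. For detailed steps, see [Feishu integration](https://docs.cowagent.ai/channels/feishu)
**Option 1: WebSocket mode (recommended, no public IP required)**
**Option 1: one-click creation via QR scan (recommended)**
After starting Cow, open the Web console and go to **通道** (Channels) → **接入通道** (Connect channel) → select **飞书** (Feishu) → scan the QR code to create the app. When started from the CLI, the QR code can also be printed in the terminal.
**Option 2: manual configuration**
Create a custom app on the Feishu Open Platform and configure its permissions, then fill the credentials into `config.json`: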
```json
{
"channel_type": "feishu",
"feishu_app_id": "APP_ID",
"feishu_app_secret": "APP_SECRET",
"feishu_event_mode": "websocket"
"feishu_stream_reply": true
}
```
**Option 2: Webhook mode (requires a public IP)**
```json
{
"channel_type": "feishu",
"feishu_app_id": "APP_ID",
"feishu_app_secret": "APP_SECRET",
"feishu_token": "VERIFICATION_TOKEN",
"feishu_event_mode": "webhook",
"feishu_port": 9891
}
```
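- `feishu_event_mode`: event-receiving mode, `websocket` (recommended) or `webhook`
- WebSocket mode requires an extra dependency: `pip3 install lark-oapi`
For detailed steps and parameter descriptions, see [Feishu integration](https://docs.cowagent.ai/channels/feishu)
- `feishu_stream_reply`: whether to enable streaming typewriter replies; enabled by default (requires the `cardkit:card:write` permission and Feishu client ≥ 7.20)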
</details>
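@@ -775,7 +769,15 @@ Coding Plan is a monthly coding subscription offered by each vendor; all vendors can be accessed via O
<details>
<summary>5. WeCom Bot - 企微智能机器人</summary>
The WeCom smart bot uses WebSocket long-connection mode and needs no public IP or domain, so configuration is simple:
The WeCom smart bot uses WebSocket long-connection mode and needs no public IP or domain. For detailed steps, see [WeCom smart bot integration](https://docs.cowagent.ai/channels/wecom-bot).
**Option 1: one-click creation via QR scan (recommended)**
After starting Cow, open the Web console and go to **通道** (Channels) → **接入通道** (Connect channel) → select **企微智能机器人** (WeCom smart bot) → scan the QR code with WeCom to create the bot.
**Option 2: manual configuration**
Create a smart bot in WeCom and select **long-connection mode**, then record the Bot ID and Secret and fill them into `config.json`: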
```json
{
@@ -784,7 +786,6 @@ Coding Plan is a monthly coding subscription offered by each vendor; all vendors can be accessed via O
"wecom_bot_secret": "YOUR_SECRET"
}
```
For detailed steps and parameter descriptions, see [WeCom smart bot integration](https://docs.cowagent.ai/channels/wecom-bot)
</details>

View File

@@ -499,6 +499,107 @@ class ConversationStore:
finally:
conn.close()
def prune_scheduled_messages(
self,
session_id: str,
keep_last_n: int,
markers: Optional[List[str]] = None,
) -> int:
"""
Keep at most ``keep_last_n`` scheduler-injected user/assistant pairs in
the session, deleting the older ones.
A scheduler-injected pair is identified by a user message whose first
text block starts with one of ``markers``; the immediately following
assistant message (next seq) is treated as its paired output.
Only scheduler-tagged messages are touched; regular user turns are
never deleted. Safe to call repeatedly; no-op if nothing to prune.
Args:
session_id: Session to prune.
keep_last_n: Maximum scheduler pairs to retain (must be >= 0).
markers: Text prefixes that identify scheduler user messages.
Defaults to ``["[SCHEDULED]", "Scheduled task"]`` so that
pairs written by older versions are also recognised.
Returns:
Number of message rows deleted.
"""
if keep_last_n < 0:
keep_last_n = 0
if markers is None:
markers = ["[SCHEDULED]", "Scheduled task"]
def _matches_marker(raw_content: str) -> bool:
try:
parsed = json.loads(raw_content)
except Exception:
parsed = raw_content
text = _extract_display_text(parsed) if not isinstance(parsed, str) else parsed
if not text:
return False
return any(text.startswith(m) for m in markers)
with self._lock:
conn = self._connect()
try:
rows = conn.execute(
"""
SELECT seq, role, content
FROM messages
WHERE session_id = ?
ORDER BY seq ASC
""",
(session_id,),
).fetchall()
# Find scheduler pairs: each is (user_seq, assistant_seq?)
pairs: List[tuple] = [] # list of (user_seq, assistant_seq_or_None)
for idx, (seq, role, raw_content) in enumerate(rows):
if role != "user" or not _matches_marker(raw_content):
continue
assistant_seq = None
# Pair with the very next message if it's an assistant turn.
if idx + 1 < len(rows):
next_seq, next_role, _ = rows[idx + 1]
if next_role == "assistant":
assistant_seq = next_seq
pairs.append((seq, assistant_seq))
if len(pairs) <= keep_last_n:
return 0
to_delete_pairs = pairs[: len(pairs) - keep_last_n]
seqs_to_delete: List[int] = []
for user_seq, assistant_seq in to_delete_pairs:
seqs_to_delete.append(user_seq)
if assistant_seq is not None:
seqs_to_delete.append(assistant_seq)
if not seqs_to_delete:
return 0
placeholders = ",".join("?" * len(seqs_to_delete))
with conn:
conn.execute(
f"DELETE FROM messages WHERE session_id = ? AND seq IN ({placeholders})",
(session_id, *seqs_to_delete),
)
conn.execute(
"""
UPDATE sessions
SET msg_count = (
SELECT COUNT(*) FROM messages WHERE session_id = ?
)
WHERE session_id = ?
""",
(session_id, session_id),
)
return len(seqs_to_delete)
finally:
conn.close()
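The pairing and sliding-window selection in `prune_scheduled_messages` can be exercised without a database. The sketch below works on plain `(seq, role, text)` tuples; `select_seqs_to_prune` is a hypothetical helper, and it assumes message content has already been reduced to display text.

```python
from typing import List, Optional, Sequence, Tuple

DEFAULT_MARKERS = ("[SCHEDULED]", "Scheduled task")


def select_seqs_to_prune(
    rows: List[Tuple[int, str, str]],
    keep_last_n: int,
    markers: Sequence[str] = DEFAULT_MARKERS,
) -> List[int]:
    """Return the seq numbers of scheduler pairs that fall outside the window."""
    keep_last_n = max(keep_last_n, 0)
    pairs: List[Tuple[int, Optional[int]]] = []
    for idx, (seq, role, text) in enumerate(rows):
        if role != "user" or not text.startswith(tuple(markers)):
            continue
        assistant_seq = None
        # Pair with the very next row only if it is an assistant turn.
        if idx + 1 < len(rows) and rows[idx + 1][1] == "assistant":
            assistant_seq = rows[idx + 1][0]
        pairs.append((seq, assistant_seq))
    # Everything except the newest keep_last_n pairs is stale.
    stale = pairs[: max(len(pairs) - keep_last_n, 0)]
    return [s for pair in stale for s in pair if s is not None]


rows = [
    (1, "user", "hi"),
    (2, "assistant", "hello"),
    (3, "user", "[SCHEDULED] price check"),
    (4, "assistant", "price=1013"),
    (5, "user", "[SCHEDULED] price check"),
    (6, "assistant", "price=1013"),
    (7, "user", "Scheduled task: legacy reminder"),
    (8, "user", "thanks"),
]
print(select_seqs_to_prune(rows, keep_last_n=1))
```

Regular turns (seq 1, 2, 8) are never candidates, and a marker message without a following assistant turn is pruned on its own.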
def cleanup_old_sessions(self, max_age_days: Optional[int] = None) -> int:
"""
Delete sessions that have not been active within max_age_days.

View File

@@ -115,7 +115,7 @@ class MemoryFlushManager:
self.last_flush_timestamp: Optional[datetime] = None
self._trim_flushed_hashes: set = set() # Content hashes of already-flushed messages
self._last_flushed_content_hash: str = "" # Content hash at last flush, for daily dedup
self._last_dream_input_hash: str = "" # Hash of dream input, for dedup
self._last_dream_input_hash: str = "" # "{date}:{daily_hash}" of last dream, for dedup
self._last_flush_thread: Optional[threading.Thread] = None
def get_today_memory_file(self, user_id: Optional[str] = None, ensure_exists: bool = False) -> Path:
@@ -175,6 +175,15 @@ class MemoryFlushManager:
injection.
"""
try:
# Strip scheduler-injected pairs before any further processing.
# These messages already serve as short-term context inside the
# receiver session; promoting them into long-term daily memory
# produces low-value flat logs (e.g. "11:28 price=1013, normal /
# 11:58 price=1013, normal / ...") and wastes summarisation tokens.
messages = self._strip_scheduler_pairs(messages)
if not messages:
return False
import hashlib
deduped = []
for m in messages:
@@ -323,13 +332,18 @@ class MemoryFlushManager:
logger.info("[DeepDream] No recent daily records, skipping to preserve existing MEMORY.md")
return False
# Dedup: skip if input materials haven't changed since last dream
# Dedup: skip if same daily content already dreamed today.
# Note: only hash daily_content (not memory_content), because deep_dream
# itself rewrites MEMORY.md as a side effect, which would otherwise
# invalidate the hash on every subsequent call within the same window.
import hashlib
input_hash = hashlib.md5((memory_content + daily_content).encode("utf-8")).hexdigest()
if not force and input_hash == self._last_dream_input_hash:
logger.debug("[DeepDream] Input unchanged since last dream, skipping")
daily_hash = hashlib.md5(daily_content.encode("utf-8")).hexdigest()
today_str = datetime.now().strftime("%Y-%m-%d")
dedup_key = f"{today_str}:{daily_hash}"
if not force and dedup_key == self._last_dream_input_hash:
logger.info("[DeepDream] Already dreamed today with same daily content, skipping")
return False
self._last_dream_input_hash = input_hash
self._last_dream_input_hash = dedup_key
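The effect of the new dedup key can be seen in isolation. This is a minimal sketch under the assumption that only the date and the daily content feed the key, as the comment above states; `dream_dedup_key` and `DreamGate` are hypothetical names, not part of `MemoryFlushManager`.

```python
import hashlib
from datetime import date


def dream_dedup_key(daily_content: str, today: date) -> str:
    """Build the "{date}:{daily_hash}" key; MEMORY.md content is deliberately excluded."""
    daily_hash = hashlib.md5(daily_content.encode("utf-8")).hexdigest()
    return f"{today.strftime('%Y-%m-%d')}:{daily_hash}"


class DreamGate:
    """Remembers the last key and reports whether a dream run should proceed."""

    def __init__(self) -> None:
        self._last_key = ""

    def should_run(self, daily_content: str, today: date, force: bool = False) -> bool:
        key = dream_dedup_key(daily_content, today)
        if not force and key == self._last_key:
            return False  # same daily content already processed today
        self._last_key = key
        return True
```

Because the key no longer depends on `memory_content`, a dream run that rewrites MEMORY.md does not invalidate it, so a second call on the same day with unchanged daily records is skipped.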
logger.info(
f"[DeepDream] Materials collected: "
@@ -642,6 +656,40 @@ class MemoryFlushManager:
return "\n".join(parts)
return ""
@classmethod
def _strip_scheduler_pairs(cls, messages: List[Dict]) -> List[Dict]:
"""Drop scheduler-injected user/assistant pairs from a flush batch.
A scheduler user message starts with the ``[SCHEDULED]`` marker
(written by ``AgentBridge.remember_scheduled_output``); the message
immediately following it (if it is an assistant turn) is its paired
output and is dropped together. Regular user/assistant turns and
any tool_use / tool_result blocks are preserved as-is.
"""
if not messages:
return messages
SCHEDULED_PREFIX = "[SCHEDULED]"
result = []
skip_next_assistant = False
for msg in messages:
if not isinstance(msg, dict):
result.append(msg)
skip_next_assistant = False
continue
role = msg.get("role")
if skip_next_assistant and role == "assistant":
skip_next_assistant = False
continue
skip_next_assistant = False
if role == "user":
text = cls._extract_text_from_content(msg.get("content", ""))
if text.lstrip().startswith(SCHEDULED_PREFIX):
skip_next_assistant = True
continue
result.append(msg)
return result
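A small worked input makes the skip logic above easier to follow. The sketch is a standalone variant that assumes `content` is already a plain string (the method above goes through `_extract_text_from_content` instead); `strip_scheduler_pairs` is a hypothetical free function.

```python
from typing import Dict, List

SCHEDULED_PREFIX = "[SCHEDULED]"


def strip_scheduler_pairs(messages: List[Dict]) -> List[Dict]:
    """Drop each [SCHEDULED] user message and the assistant turn right after it."""
    result = []
    skip_next_assistant = False
    for msg in messages:
        role = msg.get("role")
        if skip_next_assistant and role == "assistant":
            skip_next_assistant = False
            continue
        # The flag only applies to the message immediately following the marker.
        skip_next_assistant = False
        content = msg.get("content", "")
        if role == "user" and isinstance(content, str) and content.lstrip().startswith(SCHEDULED_PREFIX):
            skip_next_assistant = True
            continue
        result.append(msg)
    return result


batch = [
    {"role": "user", "content": "What is the weather?"},
    {"role": "assistant", "content": "Sunny."},
    {"role": "user", "content": "[SCHEDULED] price check"},
    {"role": "assistant", "content": "11:28 price=1013, normal"},
    {"role": "user", "content": "[SCHEDULED] reminder with no reply"},
    {"role": "user", "content": "Thanks"},
    {"role": "assistant", "content": "You're welcome."},
]
kept = strip_scheduler_pairs(batch)
print([m["content"] for m in kept])
```

Only the four regular messages survive; the assistant reply to "Thanks" is kept because the skip flag is reset as soon as a non-assistant message follows a marker.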
def create_memory_files_if_needed(workspace_dir: Path, user_id: Optional[str] = None):
"""

View File

@@ -29,7 +29,7 @@ ENVIRONMENT: All API keys from env_config are auto-injected. Use $VAR_NAME direc
SAFETY:
- Freely create/modify/delete files within the workspace
- For destructive and out-of-workspace commands, explain and confirm first"""
- For destructive commands out of workspace, explain and confirm first"""
params: dict = {
"type": "object",
@@ -238,48 +238,43 @@ SAFETY:
def _get_safety_warning(self, command: str) -> str:
"""
Get safety warning for potentially dangerous commands
Only warns about extremely dangerous system-level operations
Get safety warning for absolutely catastrophic commands only.
Keep the blocklist minimal so the agent retains maximum freedom.
:param command: Command to check
:return: Warning message if dangerous, empty string if safe
"""
cmd_lower = command.lower().strip()
# Tokenize to avoid substring false positives (e.g. `rm -rf /tmp/x`
# must not match `rm -rf /`).
tokens = command.lower().split()
# Only block extremely dangerous system operations
dangerous_patterns = [
# System shutdown/reboot
("shutdown", "This command will shut down the system"),
("reboot", "This command will reboot the system"),
("halt", "This command will halt the system"),
("poweroff", "This command will power off the system"),
# `rm -rf /` or `rm -rf /*` targeting the real root.
for i, tok in enumerate(tokens):
if tok != "rm":
continue
has_rf = False
for j in range(i + 1, len(tokens)):
t = tokens[j]
if t.startswith("-") and "r" in t and "f" in t:
has_rf = True
elif t in ("--recursive", "--force"):
continue
elif t in ("/", "/*"):
if has_rf:
return "This command will delete the entire filesystem"
break
else:
break
# Critical system modifications
("rm -rf /", "This command will delete the entire filesystem"),
("rm -rf /*", "This command will delete the entire filesystem"),
("dd if=/dev/zero", "This command can destroy disk data"),
("mkfs", "This command will format a filesystem, destroying all data"),
("fdisk", "This command modifies disk partitions"),
# Disk wiping
if "if=/dev/zero" in command.lower() and "dd " in command.lower():
return "This command can destroy disk data"
# User/system management (only if targeting system users)
("userdel root", "This command will delete the root user"),
("passwd root", "This command will change the root password"),
]
# Power control - match only as a standalone word (\b enforces word boundary)
if re.search(r'\b(shutdown|reboot|halt|poweroff)\b', command.lower()):
return "This command will shut down or restart the system"
for pattern, warning in dangerous_patterns:
if pattern in cmd_lower:
return warning
# Check for recursive deletion outside workspace
if "rm" in cmd_lower and "-rf" in cmd_lower:
# Allow deletion within current workspace
if not any(path in cmd_lower for path in ["./", self.cwd.lower()]):
# Check if targeting system directories
system_dirs = ["/bin", "/usr", "/etc", "/var", "/home", "/root", "/sys", "/proc"]
if any(sysdir in cmd_lower for sysdir in system_dirs):
return "This command will recursively delete system directories"
return "" # No warning needed
return ""
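The token-based matching introduced here can be tried as a standalone function. The sketch below is a variant rather than a copy of the method above: it additionally tracks `--recursive` / `--force` and separately given `-r -f` flags, which is an assumption about desirable behaviour and not something the diff implements. `safety_warning` is a hypothetical name.

```python
import re


def safety_warning(command: str) -> str:
    """Return a warning for a few catastrophic commands, else an empty string."""
    lowered = command.lower()
    tokens = lowered.split()

    # `rm` with both recursive and force flags whose first operand is the real root.
    for i, tok in enumerate(tokens):
        if tok != "rm":
            continue
        recursive = force = False
        for t in tokens[i + 1:]:
            if t == "--recursive":
                recursive = True
            elif t == "--force":
                force = True
            elif t.startswith("-") and not t.startswith("--"):
                recursive = recursive or "r" in t
                force = force or "f" in t
            elif t in ("/", "/*"):
                if recursive and force:
                    return "This command will delete the entire filesystem"
                break
            else:
                break  # first operand is some other path, e.g. /tmp/x

    # Disk wiping via dd.
    if "if=/dev/zero" in lowered and "dd " in lowered:
        return "This command can destroy disk data"

    # Power control, matched only as a standalone word.
    if re.search(r"\b(shutdown|reboot|halt|poweroff)\b", lowered):
        return "This command will shut down or restart the system"

    return ""
```

Comparing whole tokens is what keeps `rm -rf /tmp/x` from matching the `rm -rf /` rule, which a plain substring test cannot do.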
@staticmethod
def _convert_env_vars_for_windows(command: str, dotenv_vars: dict) -> str:

View File

@@ -84,6 +84,49 @@ def get_scheduler_service():
return _scheduler_service
def _remember_delivered_output(
agent_bridge,
task: dict,
channel_type: str,
content: str,
) -> None:
"""Best-effort persistence of the message the scheduler sent to a user.
Uses notify_session_id (the real chat session_id stored at task creation time)
so that group chats correctly associate the output with the user's conversation.
Falls back to receiver for backward compatibility with old tasks.
Per-action-type behaviour:
- agent_task / tool_call / skill_call: gated by ``scheduler_inject_to_session``
(default True). These produce AI-generated content worth remembering.
- send_message: additionally gated by ``scheduler_inject_send_message``
(default False). Fixed reminder text rarely benefits follow-up Q&A and
would just consume context tokens.
"""
if not content:
return
action = task.get("action", {})
action_type = action.get("type", "")
# send_message defaults to NOT being injected; explicit opt-in via config.
if action_type == "send_message":
if not conf().get("scheduler_inject_send_message", False):
return
session_id = action.get("notify_session_id") or action.get("receiver")
if not session_id:
return
try:
remember = getattr(agent_bridge, "remember_scheduled_output", None)
if remember:
task_desc = action.get("task_description") or action.get("content", "")
remember(session_id, str(content), channel_type=channel_type, task_description=task_desc)
except Exception as e:
logger.warning(
f"[Scheduler] Failed to remember delivered output for {session_id}: {e}"
)
def _execute_agent_task(task: dict, agent_bridge):
"""
Execute an agent_task action - let Agent handle the task
@@ -165,6 +208,7 @@ def _execute_agent_task(task: dict, agent_bridge):
# Send the reply
channel.send(reply, context)
_remember_delivered_output(agent_bridge, task, channel_type, reply.content)
logger.info(f"[Scheduler] Task {task['id']} executed successfully, result sent to {receiver}")
else:
logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
@@ -255,6 +299,7 @@ def _execute_send_message(task: dict, agent_bridge):
logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")
channel.send(reply, context)
_remember_delivered_output(agent_bridge, task, channel_type, content)
logger.info(f"[Scheduler] Task {task['id']} executed: sent message to {receiver}")
else:
logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
@@ -351,6 +396,7 @@ def _execute_tool_call(task: dict, agent_bridge):
logger.debug(f"[Scheduler] Registered request_id {request_id} -> session {receiver}")
channel.send(reply, context)
_remember_delivered_output(agent_bridge, task, channel_type, content)
logger.info(f"[Scheduler] Task {task['id']} executed: sent tool result to {receiver}")
else:
logger.error(f"[Scheduler] Failed to create channel: {channel_type}")
@@ -429,6 +475,24 @@ def _execute_skill_call(task: dict, agent_bridge):
if result_prefix:
content = f"{result_prefix}\n\n{content}"
# Send the result via channel
from channel.channel_factory import create_channel
try:
channel = create_channel(channel_type)
if channel:
# For web channel, register request_id
if channel_type == "web" and hasattr(channel, 'request_to_session'):
req_id = context.get("request_id")
if req_id:
channel.request_to_session[req_id] = receiver
logger.debug(f"[Scheduler] Registered request_id {req_id} -> session {receiver}")
channel.send(Reply(ReplyType.TEXT, content), context)
_remember_delivered_output(agent_bridge, task, channel_type, content)
except Exception as e:
logger.error(f"[Scheduler] Failed to send skill result: {e}")
logger.info(f"[Scheduler] Task {task['id']} executed: skill result sent to {receiver}")
else:
logger.error(f"[Scheduler] Task {task['id']}: No result from skill execution")
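The gating rules described in the `_remember_delivered_output` docstring can be summarised in one small function. This is a sketch only: where `scheduler_inject_to_session` is actually checked is not visible in this diff, so evaluating both switches in one place is an assumption, and `resolve_inject_session` is a hypothetical name that takes the config as a plain dict.

```python
from typing import Optional


def resolve_inject_session(action: dict, config: dict) -> Optional[str]:
    """Return the session id to inject into, or None when injection is disabled."""
    # Master switch, default on.
    if not config.get("scheduler_inject_to_session", True):
        return None
    # Fixed reminder text is only injected on explicit opt-in.
    if action.get("type") == "send_message" and not config.get(
        "scheduler_inject_send_message", False
    ):
        return None
    # Prefer the real chat session; fall back to receiver for tasks
    # created before notify_session_id existed.
    return action.get("notify_session_id") or action.get("receiver") or None
```

For a group chat task the result is the stored `notify_session_id`, while a task created by an older version resolves to its `receiver`.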

View File

@@ -158,6 +158,11 @@ class SchedulerTool(BaseTool):
# Create task
task_id = str(uuid.uuid4())[:8]
# Capture the real chat session_id at task creation time so that scheduler
# can later inject the delivered output into the user's actual conversation
# (in group chats, session_id != receiver, e.g. "user_id:group_id" on feishu).
notify_session_id = context.get("session_id")
# Build action based on message or ai_task
if message:
action = {
@@ -166,7 +171,8 @@ class SchedulerTool(BaseTool):
"receiver": context.get("receiver"),
"receiver_name": self._get_receiver_name(context),
"is_group": context.get("isgroup", False),
"channel_type": self.config.get("channel_type", "unknown")
"channel_type": self.config.get("channel_type", "unknown"),
"notify_session_id": notify_session_id,
}
else: # ai_task
action = {
@@ -175,7 +181,8 @@ class SchedulerTool(BaseTool):
"receiver": context.get("receiver"),
"receiver_name": self._get_receiver_name(context),
"is_group": context.get("isgroup", False),
"channel_type": self.config.get("channel_type", "unknown")
"channel_type": self.config.get("channel_type", "unknown"),
"notify_session_id": notify_session_id,
}
# For DingTalk one-on-one chats, additionally store sender_staff_id

View File

@@ -2,12 +2,18 @@
Vision tool - Analyze images using Vision API.
Supports local files (auto base64-encoded) and HTTP URLs.
Provider priority (default):
1. Main model via bot.call_vision — zero extra cost
2. Other models whose API key is configured — auto-discovered
3. OpenAI / LinkAI raw HTTP — reliable fallback
When use_linkai=true, LinkAI is promoted to #1.
When tool.vision.model is set, that model is used exclusively first.
Provider resolution:
- tool.vision.model (if set) means "prefer this model first; fall back to
other configured providers if it fails". The model name is mapped to its
native provider (e.g. doubao-* → Doubao, kimi-* → Moonshot, gpt-* →
OpenAI/LinkAI). That provider is tried first, then the standard auto
chain runs as fallback (with the preferred provider de-duplicated).
- Auto chain priority:
1. Main model via bot.call_vision — only when the main bot is known
to actually support vision (not just expose a call_vision method).
2. Other models whose API key is configured.
3. OpenAI / LinkAI raw HTTP.
When use_linkai=true, LinkAI is promoted to #1.
"""
import base64
@@ -48,10 +54,30 @@ _DISCOVERABLE_MODELS = [
("dashscope_api_key", const.QWEN_DASHSCOPE, const.QWEN36_PLUS, "DashScope"),
("claude_api_key", const.CLAUDEAPI, const.CLAUDE_4_6_SONNET, "Claude"),
("gemini_api_key", const.GEMINI, const.GEMINI_31_FLASH_LITE_PRE, "Gemini"),
("qianfan_api_key", const.QIANFAN, const.ERNIE_45_TURBO_VL, "Qianfan"),
("zhipu_ai_api_key", const.ZHIPU_AI, const.GLM_4_7, "ZhipuAI"),
("minimax_api_key", const.MiniMax, const.MINIMAX_M2_7, "MiniMax"),
]
# Model name prefix → discoverable provider display_name.
# Used to auto-route tool.vision.model to its native provider.
# Matched case-insensitively; longest prefix wins.
_MODEL_PREFIX_TO_PROVIDER = [
("doubao-", "Doubao"),
("kimi-", "Moonshot"),
("moonshot-", "Moonshot"),
("qwen", "DashScope"), # qwen-*, qwen3-*, qwen3.6-*, etc.
("claude-", "Claude"),
("ernie-", "Qianfan"),
("gemini-", "Gemini"),
("glm-", "ZhipuAI"),
("minimax-", "MiniMax"),
("abab", "MiniMax"),
]
# Model prefixes that natively belong to OpenAI / LinkAI (raw HTTP providers).
_OPENAI_MODEL_PREFIXES = ("gpt-", "o1-", "o3-", "o4-", "chatgpt-")
@dataclass
class VisionProvider:
@@ -116,7 +142,7 @@ class Vision(BaseTool):
"Error: No model available for Vision.\n"
"The main model does not support vision and no other API keys are configured.\n"
"Options:\n"
" 1. Switch to a multimodal model (e.g. qwen3.6-plus, claude-sonnet-4-6, gemini-2.0-flash)\n"
" 1. Switch to a multimodal model (e.g. ernie-4.5-turbo-vl, qwen3.6-plus, claude-sonnet-4-6, gemini-2.0-flash)\n"
" 2. Configure OPENAI_API_KEY: env_config(action=\"set\", key=\"OPENAI_API_KEY\", value=\"your-key\")\n"
" 3. Configure LINKAI_API_KEY: env_config(action=\"set\", key=\"LINKAI_API_KEY\", value=\"your-key\")"
)
@@ -126,6 +152,9 @@ class Vision(BaseTool):
except Exception as e:
return ToolResult.fail(f"Error: {e}")
# Default model is only used as a last-resort placeholder for providers
# whose VisionProvider.model_override is None (e.g. raw OpenAI provider
# when the user did not configure tool.vision.model).
return self._call_with_fallback(providers, DEFAULT_MODEL, question, image_content)
def _call_with_fallback(self, providers: List[VisionProvider], model: str,
@@ -162,29 +191,55 @@ class Vision(BaseTool):
def _resolve_providers(self) -> List[VisionProvider]:
"""
Build an ordered list of available providers.
Build an ordered list of providers to try.
Priority:
- use_linkai=true → [LinkAI, MainModel, OtherModels…, OpenAI]
- default → [MainModel, OtherModels…, OpenAI, LinkAI]
Semantics of `tool.vision.model`:
"Prefer this model first; fall back to other configured providers
if it fails."
"OtherModels" are auto-discovered from configured API keys.
The main model's bot_type is excluded from OtherModels to avoid
duplicating the MainModel provider.
Order:
1. The provider that natively serves `tool.vision.model` (if any
and its API key is configured) — using the user-specified model
name verbatim.
2. Auto-discovery chain as fallback:
- use_linkai=true → [LinkAI, MainModel?, OtherModels…, OpenAI]
- default → [MainModel?, OtherModels…, OpenAI, LinkAI]
MainModel is only included when the main bot is known to support
vision (see _main_bot_supports_vision).
Providers that share the same display name as the preferred provider
are de-duplicated to avoid retrying the same endpoint twice.
"""
use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
user_model = self._resolve_user_vision_model()
providers: List[VisionProvider] = []
# Step 1: preferred provider derived from tool.vision.model
if user_model:
preferred = self._route_by_model_name(user_model)
if preferred:
providers.extend(preferred)
# Step 2: auto-discovery chain as fallback
existing = {p.name for p in providers}
fallback: List[VisionProvider] = []
use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
if use_linkai:
self._append_provider(providers, self._build_linkai_provider)
self._append_provider(providers, self._build_main_model_provider)
self._append_other_model_providers(providers)
self._append_provider(providers, self._build_openai_provider)
self._append_provider(fallback, lambda: self._build_linkai_provider(user_model))
self._append_provider(fallback, self._build_main_model_provider)
self._append_other_model_providers(fallback, preferred_model=user_model)
self._append_provider(fallback, lambda: self._build_openai_provider(user_model))
else:
self._append_provider(providers, self._build_main_model_provider)
self._append_other_model_providers(providers)
self._append_provider(providers, self._build_openai_provider)
self._append_provider(providers, self._build_linkai_provider)
self._append_provider(fallback, self._build_main_model_provider)
self._append_other_model_providers(fallback, preferred_model=user_model)
self._append_provider(fallback, lambda: self._build_openai_provider(user_model))
self._append_provider(fallback, lambda: self._build_linkai_provider(user_model))
for p in fallback:
if p.name in existing:
continue
providers.append(p)
existing.add(p.name)
return providers
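The ordering described in the docstring above (preferred provider first, then the auto-discovery chain with same-name entries skipped) can be sketched on its own. This is a minimal illustration, not the project's code: `Provider`, `merge_providers`, `fallback_order`, and the model name `doubao-vision-pro` are hypothetical stand-ins for `VisionProvider` and the real builders.

```python
from typing import List, NamedTuple, Optional


class Provider(NamedTuple):
    """Hypothetical stand-in for VisionProvider: a display name plus a model."""
    name: str
    model: Optional[str] = None


def fallback_order(use_linkai: bool) -> List[str]:
    """Order of the auto-discovery chain, as stated in the docstring."""
    chain = ["MainModel", "OtherModels", "OpenAI"]
    return ["LinkAI"] + chain if use_linkai else chain + ["LinkAI"]


def merge_providers(preferred: List[Provider], fallback: List[Provider]) -> List[Provider]:
    """Keep preferred providers first, then append fallbacks with unseen names."""
    merged = list(preferred)
    seen = {p.name for p in merged}
    for p in fallback:
        if p.name in seen:
            continue  # same display name: do not retry the same endpoint twice
        merged.append(p)
        seen.add(p.name)
    return merged
```

De-duplicating by display name means the preferred entry, which carries the user-specified model name, always wins over a later fallback entry for the same vendor.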
@@ -194,29 +249,135 @@ class Vision(BaseTool):
if p:
providers.append(p)
def _append_other_model_providers(self, providers: List[VisionProvider]) -> None:
@staticmethod
def _resolve_user_vision_model() -> Optional[str]:
"""Read tool.vision.model from config; return None if unset/blank."""
tool_conf = conf().get("tool", {})
if not isinstance(tool_conf, dict):
return None
vision_conf = tool_conf.get("vision", {})
if not isinstance(vision_conf, dict):
return None
m = vision_conf.get("model")
if isinstance(m, str) and m.strip():
return m.strip()
return None
@staticmethod
def _infer_provider_from_model(model_name: str) -> Optional[str]:
"""
Infer the provider display name from a model name's prefix.
Returns None when no rule matches (or for OpenAI-family names, which
are handled separately by the caller).
"""
if not model_name:
return None
lower = model_name.lower()
# Sort by prefix length desc so the longest match wins, e.g. "moonshot-" over a hypothetical "moon"
for prefix, display_name in sorted(_MODEL_PREFIX_TO_PROVIDER, key=lambda x: -len(x[0])):
if lower.startswith(prefix.lower()):
return display_name
return None
def _route_by_model_name(self, user_model: str) -> Optional[List[VisionProvider]]:
"""
Try to build a provider list using the user-specified model name.
Returns:
- [provider, ...] : matched and at least one matching provider's key is
configured
- None : no rule matches, or the match is unusable (key missing,
bot creation failed, or the bot lacks call_vision); a warning is
logged and the caller falls through to auto-discovery
"""
lower = user_model.lower()
# OpenAI / LinkAI family
if lower.startswith(_OPENAI_MODEL_PREFIXES):
providers: List[VisionProvider] = []
# Prefer LinkAI when explicitly enabled, else OpenAI first
use_linkai = conf().get("use_linkai", False) and conf().get("linkai_api_key")
if use_linkai:
self._append_provider(providers, lambda: self._build_linkai_provider(user_model))
self._append_provider(providers, lambda: self._build_openai_provider(user_model))
else:
self._append_provider(providers, lambda: self._build_openai_provider(user_model))
self._append_provider(providers, lambda: self._build_linkai_provider(user_model))
if providers:
return providers
logger.warning(f"[Vision] tool.vision.model='{user_model}' looks like an OpenAI "
f"model but neither OPENAI_API_KEY nor LINKAI_API_KEY is configured.")
return None # fall through to auto
# Discoverable native providers (Doubao, Moonshot, etc.)
target_display = self._infer_provider_from_model(user_model)
if not target_display:
return None # unknown prefix → auto
for config_key, bot_type, _default_model, display_name in _DISCOVERABLE_MODELS:
if display_name != target_display:
continue
api_key = conf().get(config_key, "")
if not api_key or not api_key.strip():
logger.warning(f"[Vision] tool.vision.model='{user_model}' routes to "
f"'{display_name}' but '{config_key}' is not configured. "
f"Falling back to auto-discovery.")
return None # fall through to auto
try:
from models.bot_factory import create_bot
bot = create_bot(bot_type)
if not hasattr(bot, 'call_vision'):
logger.warning(f"[Vision] '{display_name}' bot does not implement call_vision.")
return None
except Exception as e:
logger.warning(f"[Vision] Failed to create '{display_name}' bot: {e}")
return None
return [VisionProvider(
name=display_name,
api_key="",
api_base="",
model_override=user_model,
use_bot=True,
fallback_bot=bot,
)]
return None
def _append_other_model_providers(self, providers: List[VisionProvider],
preferred_model: Optional[str] = None) -> None:
"""
Auto-discover other models whose API key is configured.
Skip the main model's own bot_type (already covered by MainModel provider).
Skip bot_types that already have a provider in the list (e.g. OpenAI).
Skip the main model's own bot_type (already covered by MainModel
provider), unless the main model itself does not support vision —
in that case we still want the vendor's dedicated vision model
as a fallback. Also skip bot_types that already appear in the
provider list.
If preferred_model matches a provider's family, use it instead
of that provider's hard-coded default model.
"""
# Determine main model's bot_type so we can skip it
main_bot_type = None
main_bot_supports_vision = False
if self.model and hasattr(self.model, '_resolve_bot_type'):
main_bot_type = self.model._resolve_bot_type(conf().get("model", ""))
main_bot = getattr(self.model, "bot", None)
main_bot_supports_vision = self._main_bot_supports_vision(main_bot)
existing_names = {p.name for p in providers}
preferred_provider = self._infer_provider_from_model(preferred_model) if preferred_model else None
for config_key, bot_type, default_model, display_name in _DISCOVERABLE_MODELS:
if display_name in existing_names:
continue
if bot_type == main_bot_type:
# Same bot_type as the main model is normally handled by the
# MainModel provider; only skip it here if the main model
# actually supports vision. Otherwise fall through and add
# the vendor's dedicated vision model as a fallback.
if bot_type == main_bot_type and main_bot_supports_vision:
continue
api_key = conf().get(config_key, "")
if not api_key or not api_key.strip():
continue
# Create a bot instance and check if it supports call_vision
try:
from models.bot_factory import create_bot
bot = create_bot(bot_type)
@@ -225,62 +386,105 @@ class Vision(BaseTool):
except Exception:
continue
providers.append(VisionProvider(
model_for_provider = (preferred_model
if preferred_provider == display_name and preferred_model
else default_model)
provider = VisionProvider(
name=display_name,
api_key="",
api_base="",
model_override=default_model,
model_override=model_for_provider,
use_bot=True,
fallback_bot=bot,
))
)
def _resolve_vision_model(self) -> Optional[str]:
"""
Determine which model to use for vision.
# Same vendor as the main bot is the most natural fallback when
# the main model itself does not support vision — promote it to
# the front of the list instead of relying on declaration order.
if bot_type == main_bot_type:
providers.insert(0, provider)
else:
providers.append(provider)
1. User explicit config: tool.vision.model in config.json
2. Fallback to the main configured model name
def _main_bot_supports_vision(self, bot) -> bool:
"""
tool_conf = conf().get("tool", {})
user_vision_model = tool_conf.get("vision", {}).get("model") if isinstance(tool_conf, dict) else None
if user_vision_model:
return user_vision_model
model_name = conf().get("model", "")
return model_name or None
Whether the main bot is known to natively support vision.
Having a `call_vision` method is necessary but not sufficient —
some bots implement the method against an endpoint that does not
actually serve vision models, which causes silent failures when a
vendor-foreign model name is forwarded.
Resolution order:
1. If the bot explicitly declares `supports_vision`, trust it.
This lets bots opt in or out based on their own runtime
configuration (e.g. the currently selected model).
2. Otherwise, fall back to a model-name prefix heuristic: trust
call_vision when the main model looks like an OpenAI family
model or matches a known multimodal vendor prefix.
"""
if bot is None:
return False
if hasattr(bot, "supports_vision"):
return bool(getattr(bot, "supports_vision"))
main_model = (conf().get("model") or "").lower()
if not main_model:
return False
if main_model.startswith(_OPENAI_MODEL_PREFIXES):
return True
return self._infer_provider_from_model(main_model) is not None
def _build_main_model_provider(self) -> Optional[VisionProvider]:
"""
Use the vendor's own model for vision via bot.call_vision.
Only available when the bot class has call_vision.
Gated by _main_bot_supports_vision so non-vision bots (DeepSeek, etc.)
do not get routed vendor-foreign model names.
"""
if not (self.model and hasattr(self.model, 'bot')):
return None
try:
bot = self.model.bot
if not hasattr(bot, 'call_vision'):
return None
except Exception:
return None
if not hasattr(bot, 'call_vision'):
return None
if not self._main_bot_supports_vision(bot):
return None
vision_model = self._resolve_vision_model()
# Use the configured main model name; do NOT inject tool.vision.model
# here, because by the time we reach this branch the tool.vision.model
# routing has already been attempted (and either matched the main bot
# or failed to find a provider).
main_model_name = conf().get("model") or None
return VisionProvider(
name=_MAIN_MODEL_PROVIDER_NAME,
api_key="",
api_base="",
model_override=vision_model,
model_override=main_model_name,
use_bot=True,
)
def _build_openai_provider(self) -> Optional[VisionProvider]:
def _build_openai_provider(self, preferred_model: Optional[str] = None) -> Optional[VisionProvider]:
api_key = conf().get("open_ai_api_key") or os.environ.get("OPENAI_API_KEY")
if not api_key:
return None
api_base = (conf().get("open_ai_api_base") or os.environ.get("OPENAI_API_BASE", "")).rstrip("/") \
or "https://api.openai.com/v1"
return VisionProvider(name="OpenAI", api_key=api_key, api_base=self._ensure_v1(api_base))
# Only honor preferred_model when it looks like an OpenAI-family name;
# otherwise the OpenAI endpoint would 400 on a vendor-specific name.
model_override = preferred_model if (
preferred_model and preferred_model.lower().startswith(_OPENAI_MODEL_PREFIXES)
) else None
return VisionProvider(
name="OpenAI",
api_key=api_key,
api_base=self._ensure_v1(api_base),
model_override=model_override,
)
def _build_linkai_provider(self) -> Optional[VisionProvider]:
def _build_linkai_provider(self, preferred_model: Optional[str] = None) -> Optional[VisionProvider]:
api_key = conf().get("linkai_api_key") or os.environ.get("LINKAI_API_KEY")
if not api_key:
return None
@@ -290,8 +494,15 @@ class Vision(BaseTool):
extra = get_cloud_headers(api_key)
extra.pop("Authorization", None)
extra.pop("Content-Type", None)
return VisionProvider(name="LinkAI", api_key=api_key, api_base=self._ensure_v1(api_base),
extra_headers=extra)
# LinkAI is a multi-vendor proxy and accepts most model names, so we
# honor any user-configured model name here.
return VisionProvider(
name="LinkAI",
api_key=api_key,
api_base=self._ensure_v1(api_base),
extra_headers=extra,
model_override=preferred_model,
)
def _call_via_bot(self, model: str, question: str, image_content: dict,
provider: Optional[VisionProvider] = None) -> ToolResult:

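The prefix routing used by `_infer_provider_from_model` in the diff above can be tried in isolation. The sketch below assumes a made-up prefix table, since the real `_MODEL_PREFIX_TO_PROVIDER` is not shown in this diff; `PREFIX_TO_PROVIDER`, `infer_provider`, and the `"moon"` entry are hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical table for illustration only; the project's real table may differ.
PREFIX_TO_PROVIDER: List[Tuple[str, str]] = [
    ("moon", "GenericMoon"),
    ("moonshot-", "Moonshot"),
    ("doubao-", "Doubao"),
]


def infer_provider(model_name: str) -> Optional[str]:
    """Return the provider whose prefix is the longest case-insensitive match."""
    if not model_name:
        return None
    lower = model_name.lower()
    # Longest prefix first, so a specific rule beats a shorter, overlapping one.
    for prefix, display_name in sorted(PREFIX_TO_PROVIDER, key=lambda x: -len(x[0])):
        if lower.startswith(prefix.lower()):
            return display_name
    return None
```

Sorting on every call is fine for a table this small; a larger table could be sorted once when the module loads.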

@@ -14,6 +14,7 @@ from bridge.reply import Reply, ReplyType
from common import const
from common.log import logger
from common.utils import expand_path
from config import conf
from models.openai_compatible_bot import OpenAICompatibleBot
@@ -68,6 +69,7 @@ class AgentLLMModel(LLMModel):
_MODEL_BOT_TYPE_MAP = {
"wenxin": const.BAIDU, "wenxin-4": const.BAIDU,
"xunfei": const.XUNFEI, const.QWEN: const.QWEN_DASHSCOPE,
const.QIANFAN: const.QIANFAN,
const.MODELSCOPE: const.MODELSCOPE,
}
_MODEL_PREFIX_MAP = [
@@ -75,10 +77,10 @@ class AgentLLMModel(LLMModel):
("gemini", const.GEMINI), ("glm", const.ZHIPU_AI), ("claude", const.CLAUDEAPI),
("moonshot", const.MOONSHOT), ("kimi", const.MOONSHOT),
("doubao", const.DOUBAO), ("deepseek", const.DEEPSEEK),
("ernie", const.QIANFAN),
]
def __init__(self, bridge: Bridge, bot_type: str = "chat"):
from config import conf
super().__init__(model=conf().get("model", const.GPT_41))
self.bridge = bridge
self.bot_type = bot_type
@@ -87,7 +89,6 @@ class AgentLLMModel(LLMModel):
@property
def model(self):
from config import conf
return conf().get("model", const.GPT_41)
@model.setter
@@ -96,8 +97,6 @@ class AgentLLMModel(LLMModel):
def _resolve_bot_type(self, model_name: str) -> str:
"""Resolve bot type from model name, matching Bridge.__init__ logic."""
from config import conf
if conf().get("use_linkai", False) and conf().get("linkai_api_key"):
return const.LINKAI
# Support custom bot type configuration
@@ -117,8 +116,9 @@ class AgentLLMModel(LLMModel):
return const.MOONSHOT
if conf().get("bot_type") == "modelscope":
return const.MODELSCOPE
lowered_model = model_name.lower()
for prefix, btype in self._MODEL_PREFIX_MAP:
if model_name.startswith(prefix):
if lowered_model.startswith(prefix):
return btype
return const.OPENAI
@@ -418,6 +418,18 @@ class AgentBridge:
# Store session_id on agent so executor can clear DB on fatal errors
agent._current_session_id = session_id
# Bound the in-memory context for scheduler sessions before each run.
# Scheduler sessions are stable per-task and append every trigger,
# so without trimming they would grow unbounded across runs and
# blow up prompt cost. Regular user chats are not touched here —
# the agent's own context manager handles that path.
if session_id and session_id.startswith("scheduler_"):
from config import conf
scheduler_keep_turns = max(
1, int(conf().get("agent_max_context_turns", 20)) // 5
)
self._trim_in_memory_to_turns(agent, scheduler_keep_turns)
try:
# Use agent's run_stream method with event handler
response = agent.run_stream(
@@ -634,6 +646,196 @@ class AgentBridge:
f"[AgentBridge] Failed to persist messages for session={session_id}: {e}"
)
# Marker used to identify scheduler-injected user messages so we can apply
# a sliding window without touching real user turns. The legacy prefix
# "Scheduled task" (written by the v2 PR) is also recognised when pruning,
# so old data can be aged out instead of leaking forever.
_SCHEDULED_MARKER = "[SCHEDULED]"
_SCHEDULED_LEGACY_MARKERS = ("Scheduled task",)
def remember_scheduled_output(
self,
session_id: str,
content: str,
channel_type: str = "",
task_description: str = "",
) -> None:
"""Add the visible output of a scheduled task to the receiver's session.
Scheduled task execution uses an isolated session so internal planning and
tool calls do not leak into the user's chat. The final message is still
part of the conversation from the user's point of view, so keep a small
visible turn in the receiver session for follow-up questions.
Configuration:
scheduler_inject_to_session (bool, default True):
Master switch. When False, this method is a no-op.
scheduler_inject_max_per_session (int, default 3):
Maximum scheduler-injected user/assistant pairs retained per
session. Older injections are pruned automatically.
Content is truncated to 2000 chars to prevent a single high-volume task
from bloating one entry.
"""
from config import conf
if not conf().get("scheduler_inject_to_session", True):
return
if not session_id or not content:
return
max_len = 2000
if len(content) > max_len:
content = content[:max_len] + "..."
user_text = self._SCHEDULED_MARKER
if task_description:
user_text = f"{self._SCHEDULED_MARKER} {task_description}"
messages = [
{"role": "user", "content": [{"type": "text", "text": user_text}]},
{"role": "assistant", "content": [{"type": "text", "text": content}]},
]
# Persist first so the new pair gets a stable seq, then prune old
# scheduler pairs in DB, then sync the in-memory agent.messages buffer.
self._persist_messages(session_id, messages, channel_type)
keep_last_n = max(int(conf().get("scheduler_inject_max_per_session", 3) or 0), 0)
try:
from agent.memory import get_conversation_store
deleted = get_conversation_store().prune_scheduled_messages(
session_id, keep_last_n=keep_last_n
)
if deleted:
logger.debug(
f"[AgentBridge] Pruned {deleted} old scheduler messages "
f"for session={session_id} (keep_last_n={keep_last_n})"
)
except Exception as e:
logger.warning(
f"[AgentBridge] Failed to prune scheduled messages "
f"for session={session_id}: {e}"
)
agent = self.agents.get(session_id)
if agent:
try:
with agent.messages_lock:
agent.messages.extend(messages)
self._prune_scheduled_in_memory(agent, keep_last_n)
except Exception as e:
logger.warning(
f"[AgentBridge] Failed to update in-memory scheduled output "
f"for session={session_id}: {e}"
)
@staticmethod
def _trim_in_memory_to_turns(agent, keep_turns: int) -> None:
"""Bound ``agent.messages`` to the most recent ``keep_turns`` real
user/assistant turns, dropping older history together with any
intermediate tool_use/tool_result blocks that belonged to it.
A "real" user message is any user message whose content is not solely a
tool_result block — matches the heuristic used elsewhere when filtering
history (see ``AgentInitializer._filter_text_only_messages``).
No-op when the session is already within budget. Caller does not need
to hold the lock; this method acquires it itself.
"""
if keep_turns <= 0:
return
def _is_real_user(msg) -> bool:
if not isinstance(msg, dict) or msg.get("role") != "user":
return False
content = msg.get("content")
if isinstance(content, list):
if any(
isinstance(b, dict) and b.get("type") == "tool_result"
for b in content
):
return False
return any(
isinstance(b, dict) and b.get("type") == "text" and b.get("text")
for b in content
)
if isinstance(content, str):
return bool(content.strip())
return False
with agent.messages_lock:
msgs = agent.messages
real_user_indices = [i for i, m in enumerate(msgs) if _is_real_user(m)]
if len(real_user_indices) <= keep_turns:
return
# Cut at the (k-th from the end) real user message; keep everything
# from there onwards so the surviving slice is still a valid
# user/assistant sequence.
cut_idx = real_user_indices[-keep_turns]
if cut_idx == 0:
return
kept = msgs[cut_idx:]
msgs.clear()
msgs.extend(kept)
logger.debug(
f"[AgentBridge] Trimmed in-memory messages to last "
f"{keep_turns} turns ({len(kept)} messages remain)"
)
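The turn-based trimming above can be exercised without an agent or a lock. This is a simplified sketch of the same idea on a plain list; `is_real_user` and `trim_to_turns` are illustrative names, and the locking and logging of the original are left out.

```python
from typing import Any, Dict, List


def is_real_user(msg: Dict[str, Any]) -> bool:
    """A user message that carries text and is not a tool_result wrapper."""
    if msg.get("role") != "user":
        return False
    content = msg.get("content")
    if isinstance(content, str):
        return bool(content.strip())
    if isinstance(content, list):
        if any(isinstance(b, dict) and b.get("type") == "tool_result" for b in content):
            return False
        return any(
            isinstance(b, dict) and b.get("type") == "text" and b.get("text")
            for b in content
        )
    return False


def trim_to_turns(msgs: List[Dict[str, Any]], keep_turns: int) -> None:
    """Keep everything from the keep_turns-th most recent real user message on."""
    if keep_turns <= 0:
        return
    real_user_indices = [i for i, m in enumerate(msgs) if is_real_user(m)]
    if len(real_user_indices) <= keep_turns:
        return  # already within budget
    cut_idx = real_user_indices[-keep_turns]
    msgs[:] = msgs[cut_idx:]  # in-place, so external references stay valid
```

Cutting at a real user message keeps each surviving tool_use/tool_result pair together, because those blocks always sit between two real user messages.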
@classmethod
def _prune_scheduled_in_memory(cls, agent, keep_last_n: int) -> None:
"""Mirror conversation_store.prune_scheduled_messages on agent.messages.
Caller must hold ``agent.messages_lock``.
"""
if keep_last_n < 0:
keep_last_n = 0
markers = (cls._SCHEDULED_MARKER,) + cls._SCHEDULED_LEGACY_MARKERS
def _is_marker_user(msg) -> bool:
if not isinstance(msg, dict) or msg.get("role") != "user":
return False
content = msg.get("content")
text = ""
if isinstance(content, str):
text = content
elif isinstance(content, list):
for block in content:
if isinstance(block, dict) and block.get("type") == "text":
text = block.get("text", "")
break
return any(text.startswith(m) for m in markers)
msgs = agent.messages
pair_indices = [] # list of (user_idx, assistant_idx_or_None)
for idx, msg in enumerate(msgs):
if not _is_marker_user(msg):
continue
assistant_idx = None
if idx + 1 < len(msgs):
nxt = msgs[idx + 1]
if isinstance(nxt, dict) and nxt.get("role") == "assistant":
assistant_idx = idx + 1
pair_indices.append((idx, assistant_idx))
if len(pair_indices) <= keep_last_n:
return
to_drop = pair_indices[: len(pair_indices) - keep_last_n]
drop_set = set()
for u_idx, a_idx in to_drop:
drop_set.add(u_idx)
if a_idx is not None:
drop_set.add(a_idx)
# Rebuild the list in place to keep external references stable.
kept = [m for i, m in enumerate(msgs) if i not in drop_set]
msgs.clear()
msgs.extend(kept)
@staticmethod
def _strip_thinking_blocks(messages: list) -> list:
"""Return a shallow copy of messages with assistant "thinking" blocks removed."""
@@ -746,4 +948,4 @@ class AgentBridge:
agent.tools = [t for t in agent.tools if t.name != "web_search"]
logger.info("[AgentBridge] web_search tool removed (API key no longer available)")
except Exception as e:
logger.debug(f"[AgentBridge] Failed to refresh conditional tools: {e}")

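The sliding window that `_prune_scheduled_in_memory` applies to scheduler-injected pairs can be sketched on a plain message list. This is a simplified illustration, assuming the same two markers shown in the diff; `first_text` and `prune_scheduled` are hypothetical helper names, and the return value (number of dropped messages) is an addition for testing.

```python
from typing import Any, Dict, List, Optional, Tuple

MARKERS = ("[SCHEDULED]", "Scheduled task")  # current marker plus legacy prefix


def first_text(msg: Dict[str, Any]) -> str:
    """Return the message text: the string itself or the first text block."""
    content = msg.get("content")
    if isinstance(content, str):
        return content
    if isinstance(content, list):
        for block in content:
            if isinstance(block, dict) and block.get("type") == "text":
                return block.get("text") or ""
    return ""


def prune_scheduled(msgs: List[Dict[str, Any]], keep_last_n: int) -> int:
    """Drop all but the newest keep_last_n scheduler pairs; return count dropped."""
    keep_last_n = max(keep_last_n, 0)
    pairs: List[Tuple[int, Optional[int]]] = []
    for i, m in enumerate(msgs):
        if m.get("role") != "user" or not first_text(m).startswith(MARKERS):
            continue
        has_reply = i + 1 < len(msgs) and msgs[i + 1].get("role") == "assistant"
        pairs.append((i, i + 1 if has_reply else None))
    if len(pairs) <= keep_last_n:
        return 0
    drop = set()
    for u_idx, a_idx in pairs[: len(pairs) - keep_last_n]:
        drop.add(u_idx)
        if a_idx is not None:
            drop.add(a_idx)
    msgs[:] = [m for i, m in enumerate(msgs) if i not in drop]
    return len(drop)
```

Real user turns are never touched, because only messages whose text starts with a marker are considered for removal.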

@@ -144,7 +144,15 @@ class AgentInitializer:
from agent.memory import get_conversation_store
store = get_conversation_store()
max_turns = conf().get("agent_max_context_turns", 20)
restore_turns = max(3, max_turns // 6)
# Scheduler tasks run on a stable isolated session per task and
# can fire many times a day; a dedicated restore window
# (max_turns // 5, floor 1) keeps prompt cost bounded while still
# letting the agent see the last few runs for trend / dedup style
# logic. Regular chat sessions keep the original heuristic
# (max_turns // 6, floor 3) so user dialogues feel continuous.
if session_id.startswith("scheduler_"):
restore_turns = max(1, max_turns // 5)
else:
restore_turns = max(3, max_turns // 6)
saved = store.load_messages(session_id, max_turns=restore_turns)
if saved:
filtered = self._filter_text_only_messages(saved)
@@ -549,19 +557,22 @@ class AgentInitializer:
def _daily_flush_loop():
import random
last_run_date = None # Track last successful run date to prevent same-day re-trigger
while True:
try:
now = datetime.datetime.now()
jitter_min = random.randint(50, 55)
jitter_sec = random.randint(0, 59)
target = now.replace(hour=23, minute=jitter_min, second=jitter_sec, microsecond=0)
if target <= now:
# Always schedule for tomorrow if we already ran today, or if target time has passed
if target <= now or (last_run_date == now.date()):
target += datetime.timedelta(days=1)
wait_seconds = (target - now).total_seconds()
logger.info(f"[DailyFlush] Next flush at {target.strftime('%Y-%m-%d %H:%M:%S')} (in {wait_seconds/3600:.1f}h)")
time.sleep(wait_seconds)
self._flush_all_agents()
last_run_date = datetime.datetime.now().date()
except Exception as e:
logger.warning(f"[DailyFlush] Error in daily flush loop: {e}")
time.sleep(3600)

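The `last_run_date` guard added to `_daily_flush_loop` matters because the jittered minute is redrawn on every iteration. A flush that finishes at 23:52 could draw minute 55 and fire again the same night. The sketch below isolates the target computation; `next_flush_target` is a hypothetical helper, and the jitter is passed in as arguments so the result is deterministic.

```python
import datetime
from typing import Optional


def next_flush_target(
    now: datetime.datetime,
    last_run_date: Optional[datetime.date],
    minute: int = 50,
    second: int = 0,
) -> datetime.datetime:
    """Compute the next flush time at 23:<minute>:<second>."""
    target = now.replace(hour=23, minute=minute, second=second, microsecond=0)
    # Move to tomorrow if the slot has passed or a flush already ran today.
    if target <= now or last_run_date == now.date():
        target += datetime.timedelta(days=1)
    return target
```

With `now` at 23:52 and a drawn minute of 55, the old condition (`target <= now` only) would schedule a second flush three minutes later; the added date check pushes it to the next day.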

@@ -61,6 +61,11 @@ class Bridge(object):
if model_type and model_type.startswith("deepseek"):
self.btype["chat"] = const.DEEPSEEK
if model_type and isinstance(model_type, str):
lowered_model_type = model_type.lower()
if lowered_model_type == const.QIANFAN or lowered_model_type.startswith("ernie"):
self.btype["chat"] = const.QIANFAN
if model_type in [const.MODELSCOPE]:
self.btype["chat"] = const.MODELSCOPE


@@ -55,12 +55,186 @@ def _ensure_lark_imported():
return lark
def _print_qr_to_terminal(qr_url: str):
"""Render a QR code as ASCII art and emit it via logger.
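Uses logger instead of print to avoid stdout block buffering when started in the
background via nohup/cow, which delays the QR code output and makes it look as if it
appeared twice. The logger's StreamHandler is line-buffered, so the output shows up in
the foreground terminal and also lands in run.log.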
"""
qr_lines = []
try:
import qrcode as qr_lib
import io
qr = qr_lib.QRCode(error_correction=qr_lib.constants.ERROR_CORRECT_L, box_size=1, border=1)
qr.add_data(qr_url)
qr.make(fit=True)
buf = io.StringIO()
qr.print_ascii(out=buf, invert=True)
qr_lines = buf.getvalue().splitlines()
except ImportError:
qr_lines = ["(qrcode package not installed; cannot render the ASCII QR code: pip install qrcode)"]
except Exception as e:
qr_lines = [f"(failed to render QR code: {e})"]
header = "=" * 60
banner = [
"",
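header,
" Feishu one-click app creation: scan the QR code below with the Feishu App",
" (the QR code is valid for 10 minutes and can be scanned only once)",
header,
]
footer = [
f" Or open this link to create the app: {qr_url}",
" Waiting for scan...",
"",
]
full = banner + qr_lines + footer
logger.info("[FeiShu] One-click Feishu app creation QR code (scan with the Feishu App):\n" + "\n".join(full))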
def _persist_feishu_credentials(app_id: str, app_secret: str) -> bool:
"""Write feishu_app_id / feishu_app_secret + ensure feishu in channel_type into config.json.
Returns True on success, False on failure (e.g. config.json missing or unwritable).
"""
try:
config_path = os.path.join(
os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__)))),
"config.json",
)
if os.path.exists(config_path):
with open(config_path, "r", encoding="utf-8") as f:
file_cfg = json.load(f)
else:
file_cfg = {}
file_cfg["feishu_app_id"] = app_id
file_cfg["feishu_app_secret"] = app_secret
# Ensure channel_type includes feishu (the user may start a single channel purely via the CLI)
ch_type = file_cfg.get("channel_type", conf().get("channel_type", "")) or ""
existing = [s.strip() for s in ch_type.split(",") if s.strip()]
if "feishu" not in existing:
existing.append("feishu")
file_cfg["channel_type"] = ",".join(existing)
with open(config_path, "w", encoding="utf-8") as f:
json.dump(file_cfg, f, indent=4, ensure_ascii=False)
# Sync to the in-memory conf() so the change takes effect for this run
conf()["feishu_app_id"] = app_id
conf()["feishu_app_secret"] = app_secret
if "channel_type" in file_cfg:
conf()["channel_type"] = file_cfg["channel_type"]
try:
os.chmod(config_path, 0o600)
except Exception:
pass
return True
except Exception as e:
logger.error(f"[FeiShu] Failed to persist credentials to config.json: {e}")
return False
def _register_via_qr_in_terminal() -> bool:
"""CLI-side one-click app creation via lark_oapi.register_app.
Blocks the calling thread (typically the channel startup thread) until the user
finishes scanning, the QR code expires, or registration is cancelled.
Returns True if credentials were obtained AND persisted; False otherwise.
The caller should fall back to the original "missing credentials" error in that case.
"""
if not LARK_SDK_AVAILABLE:
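logger.error(
"[FeiShu] feishu_app_id / feishu_app_secret is missing. "
"The lark-oapi SDK is not installed, so QR-code app creation cannot be started from the terminal. "
"Run pip install -U 'lark-oapi>=1.5.5' and retry, or fill in the credentials in config.json manually."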
)
return False
try:
lark_mod = _ensure_lark_imported()
except Exception as e:
logger.error(f"[FeiShu] Import lark_oapi failed: {e}")
return False
# register_app was introduced in lark-oapi 1.5.5; calling it on older versions raises a
# confusing AttributeError. Check explicitly up front and give a clear upgrade hint.
if not hasattr(lark_mod, "register_app"):
try:
from importlib.metadata import version as _pkg_version
installed = _pkg_version("lark-oapi")
except Exception:
installed = "unknown"
logger.error(
f"[FeiShu] The installed lark-oapi version ({installed}) does not support one-click app creation; >= 1.5.5 is required. "
"Run pip install -U 'lark-oapi>=1.5.5' and retry, or fill in the credentials in config.json manually."
)
return False
logger.info("[FeiShu] feishu_app_id / feishu_app_secret is not configured; "
"requesting one-click app creation from Feishu...")
def _on_qr(info):
url = info.get("url", "")
if url:
_print_qr_to_terminal(url)
def _on_status(info):
# Filter out polling heartbeats (one every 5 seconds); keep slow_down / domain_switched etc.
status = info.get("status")
if status == "polling":
return
logger.info(f"[FeiShu] register_app status: {info}")
try:
result = lark_mod.register_app(
on_qr_code=_on_qr,
on_status_change=_on_status,
source="cowagent",
)
except Exception as e:
err_cls = e.__class__.__name__
if "Expired" in err_cls:
logger.error("[FeiShu] The QR code has expired; restart the program and retry.")
elif "Denied" in err_cls:
logger.error("[FeiShu] Authorization was cancelled.")
else:
logger.error(f"[FeiShu] One-click creation failed: {e}")
return False
app_id = result.get("client_id", "")
app_secret = result.get("client_secret", "")
if not app_id or not app_secret:
logger.error("[FeiShu] The creation result is missing app_id/app_secret; cannot continue.")
return False
if not _persist_feishu_credentials(app_id, app_secret):
logger.error(
"[FeiShu] App created, but writing config.json failed. Copy the following values into the config file manually:\n"
f" feishu_app_id = {app_id}\n"
f" feishu_app_secret = {app_secret}"
)
return False
logger.info(f"[FeiShu] App created; credentials written to config.json (app_id={app_id}).")
return True
@singleton
class FeiShuChanel(ChatChannel):
feishu_app_id = conf().get('feishu_app_id')
feishu_app_secret = conf().get('feishu_app_secret')
feishu_token = conf().get('feishu_token')
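feishu_event_mode = conf().get('feishu_event_mode', 'websocket') # webhook or websocket
# Override the parent default [ReplyType.VOICE, ReplyType.IMAGE].
# Feishu natively supports sending audio (opus format, via the file upload API) and images,
# and all reply types are handled, so use an empty list to enable voice and image replies.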
NOT_SUPPORT_REPLYTYPE = []
def __init__(self):
super().__init__()
@@ -86,6 +260,20 @@ class FeiShuChanel(ChatChannel):
self.feishu_app_secret = conf().get('feishu_app_secret')
self.feishu_token = conf().get('feishu_token')
self.feishu_event_mode = conf().get('feishu_event_mode', 'websocket')
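# CLI startup: when credentials are missing, try lark.register_app to show a QR code in the terminal
# and guide the user through creating the app by scanning it. Web console startup also reaches this
# path, but console users have usually already created the app via /api/feishu/register and written it back to config.json.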
if not self.feishu_app_id or not self.feishu_app_secret:
if _register_via_qr_in_terminal():
self.feishu_app_id = conf().get('feishu_app_id')
self.feishu_app_secret = conf().get('feishu_app_secret')
else:
err = "[FeiShu] feishu_app_id and feishu_app_secret are missing; cannot start the channel"
logger.error(err)
self.report_startup_error(err)
return
self._fetch_bot_open_id()
if self.feishu_event_mode == 'websocket':
self._startup_websocket()
@@ -384,10 +572,22 @@ class FeiShuChanel(ChatChannel):
no_need_at=True
)
if context:
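# Streaming reply mode: inject an on_event callback into context; the agent calls it each time it produces a chunk of text.
# The callback first sends a placeholder message to obtain a message_id, then updates the content in place via the PATCH API
# to produce a typewriter effect. When the callback finishes it sets context["feishu_streamed"]=True
# so that send() skips the resend and the final full reply is not delivered a second time.
# Streaming typewriter replies are on by default. They require the bot to have the cardkit:card:write permission and Feishu client 7.20+;
# a failure at any step automatically degrades to a non-streaming text reply.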
if conf().get("feishu_stream_reply", True):
context["on_event"] = self._make_feishu_stream_callback(context, feishu_msg.access_token)
self.produce(context)
logger.debug(f"[FeiShu] query={feishu_msg.content}, type={feishu_msg.ctype}")
def send(self, reply: Reply, context: Context):
# Skip the resend if the text reply was already delivered via streaming
if reply.type == ReplyType.TEXT and context.get("feishu_streamed"):
logger.debug("[FeiShu] streaming already delivered text reply, skipping send()")
return
msg = context.get("msg")
is_group = context["isgroup"]
if msg:
@@ -450,6 +650,16 @@ class FeiShuChanel(ChatChannel):
msg_type = "file"
content_key = "file_key"
elif reply.type == ReplyType.VOICE:
# Voice reply: upload the audio file to Feishu, then send an audio-type message
file_key = self._upload_audio(reply.content, access_token)
if not file_key:
logger.warning("[FeiShu] upload audio failed")
return
reply_content = file_key
msg_type = "audio"
content_key = "file_key"
# Check if we can reply to an existing message (need msg_id)
can_reply = is_group and msg and hasattr(msg, 'msg_id') and msg.msg_id
@@ -481,6 +691,396 @@ class FeiShuChanel(ChatChannel):
else:
logger.error(f"[FeiShu] send message failed, code={res.get('code')}, msg={res.get('msg')}")
def _make_feishu_stream_callback(self, context, access_token):
"""
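Typewriter-style replies built on Feishu's official "streaming card update" API.
Flow:
1. First message_update arrives → POST /cardkit/v1/cards creates a card entity with streaming_mode,
then POST /im/v1/messages (or reply) sends the card by card_id
2. Subsequent message_update → PUT /cardkit/v1/cards/{id}/elements/{eid}/content
with the full text of the "current round"; the Feishu platform computes the delta and renders it with a typewriter effect
(not subject to the 10 QPS limit in streaming mode)
3. message_end (one round of LLM output has finished and this round triggered a tool call) → accumulate current into committed
and add a separator; the next round's message_update starts from blank, so rounds do not run together
4. agent_end → force-overwrite the card with final_response, then PATCH /cardkit/v1/cards/{id}/settings
to turn off streaming_mode, and set context["feishu_streamed"]=True so chat_channel skips the normal send()
Prerequisites:
- The bot has the cardkit:card:write permission
- Feishu client 7.20+
Failure fallback:
- Creating the card entity fails (missing permission, network, etc.) → do not set the feishu_streamed flag, so chat_channel
takes the normal text reply path; the user receives the full reply without the typewriter effect, and a warning is logged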
"""
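# Shared state (protected by lock)
# In multi-round agent mode, each interstitial message is sent as its own card.
# current_text only holds the content of the card currently being streamed; on message_end / agent_end
# it is finalized and reset.
current_text = [""] # LLM output accumulated for the current card
card_id = [None] # entity ID of the current streaming card (separate per segment)
message_id = [None] # message ID after the current card is sent (used for logging only)
# The placeholder send is synchronous, but an in-flight flag prevents concurrent message_update
# events from each triggering their own create+send, which would emit multiple cards.
init_in_flight = [False]
# Once initialization fails, stay disabled: no further streaming calls are attempted for this reply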
disabled = [False]
lock = threading.Lock()
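# ---- Async push queue ------------------------------------------------
# A synchronous requests.put takes 100~300 ms per call and would block the LLM stream thread from reading the next chunk.
# Hand pushes to a dedicated worker thread that consumes the queue; the callback itself only appends in memory and returns immediately.
# The queue holds only snapshots of the latest accumulated text; the worker coalesces them to avoid pushing the same
# content repeatedly (with high-frequency chunks the queue backs up, and pushing only the last one is enough).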
import queue as _queue
push_queue: "_queue.Queue[str | None]" = _queue.Queue()
def _push_worker():
while True:
snapshot = push_queue.get()
if snapshot is None:
push_queue.task_done()
return
# Coalesce snapshots already backed up in the queue: push only the last one, saving PUT calls and reducing latency
merged_count = 1
stop = False
while True:
try:
nxt = push_queue.get_nowait()
except _queue.Empty:
break
merged_count += 1
if nxt is None:
stop = True
break
snapshot = nxt
try:
_stream_update_text(snapshot)
finally:
for _ in range(merged_count):
push_queue.task_done()
if stop:
return
push_thread = threading.Thread(target=_push_worker, daemon=True, name="feishu-stream-push")
push_thread.start()
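def _drain_push_queue():
"""Wait until every PUT currently in the queue has completed. message_end/agent_end must drain before finalizing;
otherwise stale snapshots backed up in the worker may arrive after the final_text PUT and overwrite the final content."""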
try:
push_queue.join()
except Exception:
pass
msg = context.get("msg")
is_group = context.get("isgroup", False)
receiver = context.get("receiver")
receive_id_type = context.get("receive_id_type", "open_id")
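# Client-side typewriter rendering parameters (the actual character output speed in the Feishu App):
# - print_freq_ms: interval between refreshes
# - print_step: number of characters emitted per refresh
# Currently 4 characters every 40 ms ≈ 100 characters/s, close to the pace of the ChatGPT/DeepSeek web UIs.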
print_freq_ms = 40
print_step = 4
print_strategy = "fast"
headers = {
"Authorization": "Bearer " + access_token,
"Content-Type": "application/json; charset=utf-8",
}
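# element_id of the rich-text component in the card; every later streaming PUT targets this component
ELEMENT_ID = "stream_md"
# Operation sequence number; it must strictly increase on every PUT (required by Feishu)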
sequence = [0]
def _next_sequence():
sequence[0] += 1
return sequence[0]
def _build_card_json():
"""Card JSON 2.0 structure + streaming_mode + a single markdown component"""
return json.dumps({
"schema": "2.0",
"config": {
"streaming_mode": True,
"summary": {"content": "[正在生成回复...]"},
"streaming_config": {
"print_frequency_ms": {"default": print_freq_ms},
"print_step": {"default": print_step},
"print_strategy": print_strategy,
},
},
"body": {
"elements": [
{
"tag": "markdown",
"content": "...",
"element_id": ELEMENT_ID,
}
],
},
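# Note: JSON 2.0 does not support a custom fallback field (passing one raises an error).
# On clients < 7.20, Feishu automatically shows an "upgrade your client" placeholder; no configuration is needed.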
}, ensure_ascii=False)
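def _create_and_send_card():
"""Runs synchronously: create the card entity → send the message. If either step fails, set disabled=True to trigger the fallback."""
try:
# Step 1: create the card entity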
create_url = "https://open.feishu.cn/open-apis/cardkit/v1/cards"
create_body = {"type": "card_json", "data": _build_card_json()}
res = requests.post(
create_url, headers=headers, json=create_body, timeout=(5, 10)
)
res_json = res.json()
if res_json.get("code") != 0:
logger.warning(
f"[FeiShu] Stream: create card failed "
f"(code={res_json.get('code')}, msg={res_json.get('msg')}). "
f"本次回复已自动降级为普通文本回复(一次性返回完整内容)。"
f"如需开启流式打字机效果与完整 Markdown 渲染,请到飞书开放平台 "
f"https://open.feishu.cn/app 给机器人开通 cardkit:card:write 权限"
f"(创建与更新卡片)并重新发布版本,同时确保飞书客户端 >= 7.20。"
)
with lock:
disabled[0] = True
return
cid = res_json["data"]["card_id"]
with lock:
card_id[0] = cid
# Step 2: send the message by card_id (prefer reply in group chats; send directly in private chats)
content_payload = json.dumps(
{"type": "card", "data": {"card_id": cid}}, ensure_ascii=False
)
can_reply = is_group and msg and hasattr(msg, "msg_id") and msg.msg_id
if can_reply:
send_url = (
f"https://open.feishu.cn/open-apis/im/v1/messages/"
f"{msg.msg_id}/reply"
)
send_body = {"msg_type": "interactive", "content": content_payload}
send_res = requests.post(
send_url, headers=headers, json=send_body, timeout=(5, 10)
)
else:
send_url = "https://open.feishu.cn/open-apis/im/v1/messages"
params = {"receive_id_type": receive_id_type}
send_body = {
"receive_id": receiver,
"msg_type": "interactive",
"content": content_payload,
}
send_res = requests.post(
send_url, headers=headers, params=params, json=send_body,
timeout=(5, 10),
)
send_json = send_res.json()
if send_json.get("code") != 0:
logger.warning(
f"[FeiShu] Stream: send card failed: {send_json}. 降级为普通文本。"
)
with lock:
disabled[0] = True
return
mid = send_json["data"]["message_id"]
with lock:
message_id[0] = mid
logger.info(
f"[FeiShu] Stream: card created and sent, "
f"card_id={cid}, message_id={mid}"
)
except Exception as e:
logger.warning(
f"[FeiShu] Stream: create/send card exception: {e}. 降级为普通文本。"
)
with lock:
disabled[0] = True
finally:
with lock:
init_in_flight[0] = False
def _stream_update_text(full_text):
"""Streaming PUT update of the text component. content must be the component's full current text."""
with lock:
cid = card_id[0]
if not cid:
return
url = (
f"https://open.feishu.cn/open-apis/cardkit/v1/cards/"
f"{cid}/elements/{ELEMENT_ID}/content"
)
body = {
"content": full_text,
"sequence": _next_sequence(),
}
try:
res = requests.put(url, headers=headers, json=body, timeout=(5, 10))
res_json = res.json()
if res_json.get("code") != 0:
logger.warning(
f"[FeiShu] Stream: update text failed: {res_json}"
)
except Exception as e:
logger.warning(f"[FeiShu] Stream: update text exception: {e}")
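def _close_streaming_mode(final_text: str = ""):
"""Turn off streaming mode (the card becomes a "normal" card that can be forwarded).
Also rewrites summary to a preview of the final content via the full-card update endpoint; otherwise the Feishu
conversation list keeps showing the placeholder summary set at card creation ("[正在生成回复...]").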
"""
with lock:
cid = card_id[0]
if not cid:
return
# 1) Use the full-card update endpoint to turn off streaming_mode and rewrite summary.
# The settings endpoint's config does not accept a summary field and returns code=2200.
preview_src = (final_text or "").strip().replace("\n", " ")
preview = preview_src[:30] if preview_src else ""
full_card = {
"schema": "2.0",
"config": {
"streaming_mode": False,
"summary": {"content": preview or " "},
},
"body": {
"elements": [
{
"tag": "markdown",
"content": final_text or " ",
"element_id": ELEMENT_ID,
}
],
},
}
put_url = f"https://open.feishu.cn/open-apis/cardkit/v1/cards/{cid}"
put_body = {
"card": {"type": "card_json", "data": json.dumps(full_card, ensure_ascii=False)},
"sequence": _next_sequence(),
}
try:
res = requests.put(put_url, headers=headers, json=put_body, timeout=(5, 10))
res_json = res.json()
if res_json.get("code") != 0:
logger.warning(
f"[FeiShu] Stream: finalize card (close+summary) failed: {res_json}"
)
except Exception as e:
logger.warning(
f"[FeiShu] Stream: finalize card exception: {e}"
)
def on_event(event: dict):
event_type = event.get("type")
data = event.get("data", {})
# Once degraded, no further streaming operations are performed for this reply
with lock:
if disabled[0]:
return
if event_type == "message_update":
delta = data.get("delta", "")
if not delta:
return
# Phase 1: decide whether initialization is needed (create card + send)
need_init = False
with lock:
if card_id[0] is None and not init_in_flight[0]:
init_in_flight[0] = True
need_init = True
if need_init:
_create_and_send_card()
# A failed initialization has already set disabled; bail out here
with lock:
if disabled[0]:
return
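# Phase 2: append the text and hand a snapshot to the push worker for asynchronous delivery.
# requests.put must not be called here directly, or it would block the LLM stream thread from reading the next chunk
# (measured with DeepSeek's high-frequency small chunks: ~150ms per PUT, which adds up to very noticeable lag).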
snapshot = ""
should_push = False
with lock:
current_text[0] += delta
if card_id[0]:
snapshot = current_text[0]
should_push = True
if should_push:
push_queue.put(snapshot)
elif event_type == "message_end":
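# One round of LLM output has ended. If this round triggered tool calls, its text is an
# intermediate "interlude" message (e.g. "Let me take a look!") and should be finalized as its own card,
# with a new card created for the next round. The end user then sees:
# [Card 1: interlude 1]
# [Card 2: interlude 2]
# ...
# [Card N: final reply]
# This matches the multi-message streaming experience of wecom_bot.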
tool_calls = data.get("tool_calls", []) or []
if not tool_calls:
# No tool calls: this round is the final reply; leave it to agent_end.
return
with lock:
text_to_finalize = current_text[0].rstrip()
current_text[0] = ""
if not text_to_finalize:
return
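# Wait for queued snapshots to finish pushing so none arrives after the final text and overwrites it
_drain_push_queue()
# Overwrite the current card with the final text and turn off streaming mode (freezing it into a normal card
# and changing the conversation-list summary to a preview instead of "正在生成回复...")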
_stream_update_text(text_to_finalize)
_close_streaming_mode(text_to_finalize)
# Reset card state; the next message_update will create a new card
with lock:
card_id[0] = None
message_id[0] = None
sequence[0] = 0
elif event_type == "agent_end":
# Final reply: overwrite the current streaming card with final_response, then turn off streaming mode.
final_response = data.get("final_response", "")
if not final_response:
return
final_text = str(final_response)
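with lock:
has_card = card_id[0] is not None
init_busy = init_in_flight[0]
# Rare case: no card exists yet when agent_end fires (very fast return / no
# message_update); create one now to carry final_text.
if not has_card and not init_busy:
with lock:
init_in_flight[0] = True
_create_and_send_card()
with lock:
if disabled[0]:
return
# Set the streamed flag so chat_channel skips send(). This happens only after card creation is known
# to have succeeded, so a failed creation still falls back to the plain-text reply.
context["feishu_streamed"] = True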
_drain_push_queue()
_stream_update_text(final_text)
_close_streaming_mode(final_text)
# Tell the push worker to exit (this reply is completely finished)
push_queue.put(None)
return on_event
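The coalescing push worker above can be sketched in isolation as follows. This is a minimal illustration, not the channel's code: `start_coalescing_worker` and the `push` callback are hypothetical names, and `push` stands in for the real PUT call.

```python
import queue
import threading


def start_coalescing_worker(push):
    """Start a worker that forwards only the newest queued snapshot to push().

    `push` is a hypothetical stand-in for the slow network call.
    Returns (queue, thread); put None on the queue to stop the worker.
    """
    q = queue.Queue()

    def worker():
        while True:
            snapshot = q.get()
            if snapshot is None:          # sentinel: stop the worker
                q.task_done()
                return
            taken, stop = 1, False
            while True:                   # drain whatever piled up meanwhile
                try:
                    nxt = q.get_nowait()
                except queue.Empty:
                    break
                taken += 1
                if nxt is None:
                    stop = True
                    break
                snapshot = nxt            # a newer snapshot supersedes older ones
            try:
                push(snapshot)
            finally:
                for _ in range(taken):    # one task_done() per get(), so join() can return
                    q.task_done()
            if stop:
                return

    t = threading.Thread(target=worker, daemon=True)
    t.start()
    return q, t
```

Calling `q.join()` before a final write plays the role of `_drain_push_queue()`: it returns only after every snapshot taken off the queue has been accounted for, so no stale snapshot can land after the final text.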
def fetch_access_token(self) -> str:
url = "https://open.feishu.cn/open-apis/auth/v3/tenant_access_token/internal/"
headers = {
@@ -687,6 +1287,66 @@ class FeiShuChanel(ChatChannel):
except Exception as e:
logger.warning(f"[FeiShu] Failed to remove temp file {temp_file}: {e}")
def _upload_audio(self, audio_path, access_token):
"""
Upload a local audio file to Feishu and return file_key.
audio_path is a plain local file path (no file:// prefix).
Feishu audio messages only support opus format; non-opus files are converted first.
"""
logger.debug(f"[FeiShu] start upload audio, path={audio_path}")
if not os.path.exists(audio_path):
logger.error(f"[FeiShu] audio file not found: {audio_path}")
return None
# Feishu only plays audio messages in opus format.
# Convert if the TTS engine produced a different format (e.g. mp3 from OpenAI TTS).
upload_path = audio_path
if not audio_path.lower().endswith('.opus'):
opus_path = os.path.splitext(audio_path)[0] + '.opus'
try:
from pydub import AudioSegment
audio = AudioSegment.from_file(audio_path)
audio.export(opus_path, format='opus')
upload_path = opus_path
logger.info(f"[FeiShu] Converted audio to opus: {opus_path}")
except Exception as e:
logger.warning(f"[FeiShu] Failed to convert audio to opus, uploading original: {e}")
upload_path = audio_path
file_name = os.path.splitext(os.path.basename(upload_path))[0] + '.opus'
upload_url = "https://open.feishu.cn/open-apis/im/v1/files"
data = {'file_type': 'opus', 'file_name': file_name}
headers = {'Authorization': f'Bearer {access_token}'}
try:
with open(upload_path, "rb") as f:
upload_response = requests.post(
upload_url,
files={"file": f},
data=data,
headers=headers,
timeout=(5, 30)
)
logger.info(
f"[FeiShu] upload audio response, status={upload_response.status_code}, res={upload_response.content}")
response_data = upload_response.json()
if response_data.get("code") == 0:
return response_data.get("data").get("file_key")
else:
logger.error(f"[FeiShu] upload audio failed: {response_data}")
return None
except Exception as e:
logger.error(f"[FeiShu] upload audio exception: {e}")
return None
finally:
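# Clean up the temporary opus file produced by conversion whether or not the upload succeeded, so failures do not leave files piling up on disk.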
if upload_path != audio_path and os.path.exists(upload_path):
try:
os.remove(upload_path)
except Exception as e:
logger.warning(f"[FeiShu] Failed to remove temp opus file {upload_path}: {e}")
def _upload_file_url(self, file_url, access_token):
"""
Upload file to Feishu

@@ -162,6 +162,38 @@ class FeishuMessage(ChatMessage):
else:
logger.info(f"[FeiShu] Failed to download file, key={file_key}, res={response.text}")
self._prepare_fn = _download_file
elif msg_type == "audio":
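# Voice messages sent by Feishu users have message type "audio"; the file is opus-encoded.
# Map it to ContextType.VOICE so chat_channel's speech-to-text (STT) flow handles it.
# The file is downloaded lazily via _prepare_fn, only when chat_channel calls cmsg.prepare().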
self.ctype = ContextType.VOICE
content = json.loads(msg.get("content"))
file_key = content.get("file_key")
self.content = TmpDir().path() + file_key + ".opus"
logger.info(f"[FeiShu] audio message: file_key={file_key}, save_path={self.content}")
def _download_audio():
logger.info(f"[FeiShu] downloading audio: file_key={file_key}, msg_id={self.msg_id}")
url = f"https://open.feishu.cn/open-apis/im/v1/messages/{self.msg_id}/resources/{file_key}"
headers = {
"Authorization": "Bearer " + access_token,
}
params = {
"type": "file"
}
try:
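# Bound the request so a stalled download cannot hang message preparation (timeout values are a suggestion)
response = requests.get(url=url, headers=headers, params=params, timeout=(5, 30))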
logger.info(f"[FeiShu] download audio response: status={response.status_code}, size={len(response.content)} bytes")
if response.status_code == 200:
with open(self.content, "wb") as f:
f.write(response.content)
logger.info(f"[FeiShu] audio saved to: {self.content}")
else:
logger.error(f"[FeiShu] Failed to download audio, key={file_key}, status={response.status_code}, res={response.text}")
except Exception as e:
logger.error(f"[FeiShu] Exception downloading audio, key={file_key}: {e}", exc_info=True)
self._prepare_fn = _download_audio
else:
raise NotImplementedError("Unsupported message type: Type:{} ".format(msg_type))

@@ -78,6 +78,19 @@ const I18N = {
wecom_scan_success: '创建成功,正在启动通道...',
wecom_scan_fail: '创建失败',
wecom_mode_scan: '扫码接入', wecom_mode_manual: '手动填写',
feishu_scan_btn: '一键创建飞书应用',
feishu_scan_desc: '使用飞书 App 扫码,自动创建应用并预置全部权限与事件订阅',
feishu_scan_replace_desc: '使用飞书 App 扫码创建新机器人,将覆盖当前的 App ID / Secret',
feishu_scan_loading: '正在向飞书申请二维码...',
feishu_scan_waiting: '等待扫码...',
feishu_scan_tip: '二维码 10 分钟内有效,仅供一次扫描',
feishu_scan_open_link: '或点击此处在浏览器中打开',
feishu_scan_success: '应用创建成功,正在启动通道...',
feishu_scan_expired: '二维码已过期,请重试',
feishu_scan_denied: '已取消授权',
feishu_scan_fail: '创建失败',
feishu_scan_retry: '重试',
feishu_mode_scan: '扫码创建', feishu_mode_manual: '手动填写',
tasks_title: '定时任务', tasks_desc: '查看和管理定时任务',
tasks_coming: '即将推出', tasks_coming_desc: '定时任务管理功能即将在此提供',
logs_title: '日志', logs_desc: '实时日志输出 (run.log)',
@@ -164,6 +177,19 @@ const I18N = {
wecom_scan_success: 'Bot created, starting channel...',
wecom_scan_fail: 'Bot creation failed',
wecom_mode_scan: 'Scan QR', wecom_mode_manual: 'Manual',
feishu_scan_btn: 'One-click Create Feishu App',
feishu_scan_desc: 'Scan with Feishu App to create an app with all required permissions pre-configured',
feishu_scan_replace_desc: 'Scan with Feishu App to create a new bot — will overwrite the current App ID / Secret',
feishu_scan_loading: 'Requesting QR code from Feishu...',
feishu_scan_waiting: 'Waiting for scan...',
feishu_scan_tip: 'QR code expires in 10 minutes, single use only',
feishu_scan_open_link: 'Or click here to open in browser',
feishu_scan_success: 'App created, starting channel...',
feishu_scan_expired: 'QR code expired, please retry',
feishu_scan_denied: 'Authorization cancelled',
feishu_scan_fail: 'App creation failed',
feishu_scan_retry: 'Retry',
feishu_mode_scan: 'Scan QR', feishu_mode_manual: 'Manual',
tasks_title: 'Scheduled Tasks', tasks_desc: 'View and manage scheduled tasks',
tasks_coming: 'Coming Soon', tasks_coming_desc: 'Scheduled task management will be available here',
logs_title: 'Logs', logs_desc: 'Real-time log output (run.log)',
@@ -2999,6 +3025,8 @@ function renderActiveChannels() {
const weixinWaiting = ch.name === 'weixin' && ch.login_status && ch.login_status !== 'logged_in';
const wecomNeedsCreds = ch.name === 'wecom_bot' && !_wecomBotHasCreds(ch);
// The active Feishu card renders a tabbed panel: manual entry + rebuild by QR scan (overwrites the existing config)
const isFeishu = ch.name === 'feishu';
let statusDot, statusText;
if (weixinWaiting) {
statusDot = 'bg-amber-400 animate-pulse';
@@ -3014,7 +3042,7 @@ function renderActiveChannels() {
}
card.innerHTML = `
<div class="flex items-center gap-4${hasFields || weixinWaiting || wecomNeedsCreds ? ' mb-5' : ''}">
<div class="flex items-center gap-4${hasFields || weixinWaiting || wecomNeedsCreds || isFeishu ? ' mb-5' : ''}">
<div class="w-10 h-10 rounded-xl bg-${ch.color}-50 dark:bg-${ch.color}-900/20 flex items-center justify-center flex-shrink-0">
<i class="fas ${ch.icon} text-${ch.color}-500 text-base"></i>
</div>
@@ -3050,7 +3078,7 @@ function renderActiveChannels() {
</button>
<div id="wecom-card-scan-status" class="mt-3"></div>
</div>` : ''}
${hasFields ? `<div class="space-y-4">
${isFeishu ? buildFeishuPanel(ch, true) : (hasFields ? `<div class="space-y-4">
${fieldsHtml}
<div class="flex items-center justify-end gap-3 pt-1">
<span id="ch-status-${ch.name}" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
@@ -3059,7 +3087,7 @@ function renderActiveChannels() {
cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
id="ch-save-${ch.name}">${t('channels_save')}</button>
</div>
</div>` : ''}`;
</div>` : '')}`;
container.appendChild(card);
bindSecretFieldEvents(card);
@@ -3256,6 +3284,7 @@ function openAddChannelPanel() {
function closeAddChannelPanel() {
stopWeixinQrPoll();
stopFeishuRegisterPoll();
const panel = document.getElementById('channels-add-panel');
if (panel) {
panel.classList.add('hidden');
@@ -3267,6 +3296,7 @@ function closeAddChannelPanel() {
function onAddChannelSelect(chName) {
stopWeixinQrPoll();
stopFeishuRegisterPoll();
const fieldsContainer = document.getElementById('add-channel-fields');
const actions = document.getElementById('add-channel-actions');
@@ -3293,6 +3323,13 @@ function onAddChannelSelect(chName) {
return;
}
if (chName === 'feishu') {
actions.classList.add('hidden');
const ch = channelsData.find(c => c.name === chName);
fieldsContainer.innerHTML = buildFeishuPanel(ch);
return;
}
const ch = channelsData.find(c => c.name === chName);
if (!ch) return;
@@ -3690,15 +3727,246 @@ function startWecomBotAuthInCard() {
// Initialize wecom bot panel with correct default mode when inserted into DOM
document.addEventListener('DOMContentLoaded', function() {
const observer = new MutationObserver(function() {
const panel = document.getElementById('wecom-bot-panel');
if (panel && !panel.dataset.initialized) {
panel.dataset.initialized = '1';
switchWecomBotMode(panel.dataset.defaultMode || 'scan');
const wecomPanel = document.getElementById('wecom-bot-panel');
if (wecomPanel && !wecomPanel.dataset.initialized) {
wecomPanel.dataset.initialized = '1';
switchWecomBotMode(wecomPanel.dataset.defaultMode || 'scan');
}
const feishuPanel = document.getElementById('feishu-panel');
if (feishuPanel && !feishuPanel.dataset.initialized) {
feishuPanel.dataset.initialized = '1';
switchFeishuMode(feishuPanel.dataset.defaultMode || 'scan');
}
});
observer.observe(document.body, { childList: true, subtree: true });
});
// =====================================================================
// Feishu One-click App Registration (lark-oapi register_app)
// =====================================================================
let _feishuRegisterPollTimer = null;
function _feishuHasCreds(ch) {
if (!ch || !ch.fields) return false;
const idField = ch.fields.find(f => f.key === 'feishu_app_id');
const secretField = ch.fields.find(f => f.key === 'feishu_app_secret');
return !!(idField && idField.value && secretField && secretField.value);
}
function buildFeishuPanel(ch, isActive) {
const scanLabel = t('feishu_mode_scan');
const manualLabel = t('feishu_mode_manual');
// With existing credentials, default to the manual tab for easy editing; otherwise recommend scanning
const defaultMode = _feishuHasCreds(ch) ? 'manual' : 'scan';
const activeAttr = isActive ? 'data-active="1"' : '';
return `
<div id="feishu-panel" data-default-mode="${defaultMode}" ${activeAttr}>
<div class="flex items-center justify-center gap-1 mb-5 bg-slate-100 dark:bg-white/5 rounded-lg p-1">
<button id="feishu-tab-scan" onclick="switchFeishuMode('scan')"
class="flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors
bg-white dark:bg-slate-700 text-slate-800 dark:text-slate-100 shadow-sm">
${scanLabel}
</button>
<button id="feishu-tab-manual" onclick="switchFeishuMode('manual')"
class="flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors
text-slate-500 dark:text-slate-400 hover:text-slate-700 dark:hover:text-slate-200">
${manualLabel}
</button>
</div>
<div id="feishu-mode-content"></div>
</div>`;
}
function switchFeishuMode(mode) {
const panel = document.getElementById('feishu-panel');
const scanTab = document.getElementById('feishu-tab-scan');
const manualTab = document.getElementById('feishu-tab-manual');
const content = document.getElementById('feishu-mode-content');
if (!scanTab || !manualTab || !content) return;
// When this panel is embedded in an already-active channel card there is no add-channel-actions; the save button is rendered inline
const isActive = panel && panel.dataset.active === '1';
const actions = isActive ? null : document.getElementById('add-channel-actions');
const activeClasses = 'bg-white dark:bg-slate-700 text-slate-800 dark:text-slate-100 shadow-sm';
const inactiveClasses = 'text-slate-500 dark:text-slate-400 hover:text-slate-700 dark:hover:text-slate-200';
stopFeishuRegisterPoll();
if (mode === 'scan') {
scanTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${activeClasses}`;
manualTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${inactiveClasses}`;
if (actions) actions.classList.add('hidden');
// Hint text for scan-to-replace on the active card, stressing that creating a new bot overwrites the existing config
const desc = isActive
? t('feishu_scan_replace_desc')
: t('feishu_scan_desc');
content.innerHTML = `
<div id="feishu-scan-panel" class="flex flex-col items-center py-4">
<p class="text-sm text-slate-600 dark:text-slate-300 mb-3 text-center">${desc}</p>
<button onclick="startFeishuRegister()"
class="mt-2 px-6 py-2.5 rounded-lg bg-emerald-500 hover:bg-emerald-600 text-white text-sm font-medium
cursor-pointer transition-colors duration-150">
<i class="fas fa-qrcode mr-2"></i>${t('feishu_scan_btn')}
</button>
<div id="feishu-scan-status" class="mt-4 w-full"></div>
</div>`;
} else {
manualTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${activeClasses}`;
scanTab.className = `flex-1 px-3 py-1.5 rounded-md text-xs font-medium transition-colors ${inactiveClasses}`;
const ch = channelsData.find(c => c.name === 'feishu');
const fieldsHtml = buildChannelFieldsHtml('feishu', ch ? ch.fields || [] : []);
if (isActive) {
// Already-connected card: built-in save button that reuses saveChannelConfig for the update flow
content.innerHTML = `
<div class="space-y-4">
${fieldsHtml}
<div class="flex items-center justify-end gap-3 pt-1">
<span id="ch-status-feishu" class="text-xs text-primary-500 opacity-0 transition-opacity duration-300"></span>
<button onclick="saveChannelConfig('feishu')"
class="px-4 py-2 rounded-lg bg-primary-500 hover:bg-primary-600 text-white text-sm font-medium
cursor-pointer transition-colors duration-150 disabled:opacity-50 disabled:cursor-not-allowed"
id="ch-save-feishu">${t('channels_save')}</button>
</div>
</div>`;
} else {
content.innerHTML = `<div class="space-y-4">${fieldsHtml}</div>`;
if (actions) actions.classList.remove('hidden');
}
bindSecretFieldEvents(content);
}
}
function stopFeishuRegisterPoll() {
if (_feishuRegisterPollTimer) {
clearTimeout(_feishuRegisterPollTimer);
_feishuRegisterPollTimer = null;
}
}
function startFeishuRegister(targetStatusId) {
const statusId = targetStatusId || 'feishu-scan-status';
const statusEl = document.getElementById(statusId);
if (statusEl) {
statusEl.innerHTML = `<p class="text-sm text-slate-500 dark:text-slate-400 text-center">${t('feishu_scan_loading')}</p>`;
}
stopFeishuRegisterPoll();
fetch('/api/feishu/register')
.then(r => r.json())
.then(data => {
if (data.status !== 'success') {
renderFeishuRegisterError(statusId, data.message || t('feishu_scan_fail'));
return;
}
renderFeishuQr(statusId, data.qr_image, data.qrcode_url);
pollFeishuRegisterStatus(statusId);
})
.catch(err => {
renderFeishuRegisterError(statusId, err.message || t('feishu_scan_fail'));
});
}
function renderFeishuQr(statusId, qrImage, qrUrl) {
const statusEl = document.getElementById(statusId);
if (!statusEl) return;
const imgHtml = qrImage
? `<img src="${qrImage}" alt="QR" class="w-44 h-44 rounded-lg border border-slate-200 dark:border-white/10 bg-white p-2"/>`
: `<div class="w-44 h-44 rounded-lg border border-dashed border-slate-300 flex items-center justify-center text-xs text-slate-400">QR</div>`;
statusEl.innerHTML = `
<div class="flex flex-col items-center gap-3">
${imgHtml}
<p class="text-xs text-amber-500">${t('feishu_scan_waiting')}</p>
<p class="text-xs text-slate-400 dark:text-slate-500">${t('feishu_scan_tip')}</p>
${qrUrl ? `<a href="${qrUrl}" target="_blank" rel="noopener"
class="text-xs text-blue-500 hover:text-blue-600 underline">${t('feishu_scan_open_link')}</a>` : ''}
</div>`;
}
function renderFeishuRegisterError(statusId, message) {
const statusEl = document.getElementById(statusId);
if (!statusEl) return;
statusEl.innerHTML = `
<div class="flex flex-col items-center gap-2 py-2">
<p class="text-sm text-red-500 text-center">${message}</p>
<button onclick="startFeishuRegister('${statusId}')"
class="mt-1 px-4 py-1.5 rounded-md text-xs font-medium
bg-slate-100 dark:bg-white/10 text-slate-700 dark:text-slate-200
hover:bg-slate-200 dark:hover:bg-white/20 cursor-pointer">
<i class="fas fa-rotate-right mr-1"></i>${t('feishu_scan_retry')}
</button>
</div>`;
}
function pollFeishuRegisterStatus(statusId) {
stopFeishuRegisterPoll();
_feishuRegisterPollTimer = setTimeout(() => {
fetch('/api/feishu/register', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ action: 'poll' })
})
.then(r => r.json())
.then(data => {
if (data.status !== 'success') {
renderFeishuRegisterError(statusId, data.message || t('feishu_scan_fail'));
return;
}
const rs = data.register_status;
if (rs === 'done') {
const statusEl = document.getElementById(statusId);
if (statusEl) {
statusEl.innerHTML = `
<div class="flex flex-col items-center py-2">
<div class="w-10 h-10 rounded-full bg-emerald-50 dark:bg-emerald-900/30 flex items-center justify-center mb-2">
<i class="fas fa-check text-emerald-500 text-lg"></i>
</div>
<p class="text-sm font-medium text-emerald-600 dark:text-emerald-400">${t('feishu_scan_success')}</p>
</div>`;
}
connectFeishuAfterRegister(data.app_id, data.app_secret);
} else if (rs === 'expired') {
renderFeishuRegisterError(statusId, t('feishu_scan_expired'));
} else if (rs === 'denied') {
renderFeishuRegisterError(statusId, t('feishu_scan_denied'));
} else if (rs === 'error') {
renderFeishuRegisterError(statusId, data.message || t('feishu_scan_fail'));
} else {
pollFeishuRegisterStatus(statusId);
}
})
.catch(() => {
pollFeishuRegisterStatus(statusId);
});
}, 2000);
}
function connectFeishuAfterRegister(appId, appSecret) {
fetch('/api/channels', {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
action: 'connect',
channel: 'feishu',
config: { feishu_app_id: appId, feishu_app_secret: appSecret }
})
})
.then(r => r.json())
.then(data => {
if (data.status === 'success') {
const ch = channelsData.find(c => c.name === 'feishu');
if (ch) {
ch.active = true;
(ch.fields || []).forEach(f => {
if (f.key === 'feishu_app_id') f.value = appId;
if (f.key === 'feishu_app_secret') f.value = ChannelsHandler_maskSecret(appSecret);
});
}
setTimeout(() => renderActiveChannels(), 1500);
}
})
.catch(() => {});
}
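For readers following the backend in Python, the polling flow of `pollFeishuRegisterStatus` can be sketched as a plain loop. This is an illustrative rendering under stated assumptions: `poll_register`, `fetch_status`, and `max_polls` are hypothetical (the JavaScript above keeps polling with no upper bound), and `fetch_status` stands in for the POST to `/api/feishu/register`.

```python
import time

# Values of register_status that end the polling loop.
TERMINAL = {"done", "expired", "denied", "error"}


def poll_register(fetch_status, interval=2.0, max_polls=300, sleep=time.sleep):
    """Poll until the registration reaches a terminal state.

    fetch_status: hypothetical callable returning the decoded JSON response.
    max_polls: an assumed safety bound, not part of the original flow.
    Returns (outcome, last_response).
    """
    for _ in range(max_polls):
        sleep(interval)
        try:
            data = fetch_status()
        except Exception:
            continue                      # transient failure: just poll again
        if data.get("status") != "success":
            return "error", data          # the request itself was rejected
        rs = data.get("register_status")
        if rs in TERMINAL:
            return rs, data
        # anything else (pending / starting) means keep waiting
    return "timeout", {}
```

Injecting `sleep` keeps the loop easy to exercise without real delays.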
// =====================================================================
// Scheduler View
// =====================================================================

@@ -575,6 +575,7 @@ class WebChannel(ChatChannel):
'/config', 'ConfigHandler',
'/api/channels', 'ChannelsHandler',
'/api/weixin/qrlogin', 'WeixinQrHandler',
'/api/feishu/register', 'FeishuRegisterHandler',
'/api/tools', 'ToolsHandler',
'/api/skills', 'SkillsHandler',
'/api/memory', 'MemoryHandler',
@@ -779,6 +780,7 @@ class ConfigHandler:
const.QWEN36_PLUS, const.QWEN35_PLUS, const.QWEN3_MAX,
const.DOUBAO_SEED_2_PRO, const.DOUBAO_SEED_2_CODE,
const.KIMI_K2_6, const.KIMI_K2_5, const.KIMI_K2,
const.ERNIE_5, const.ERNIE_X1_1, const.ERNIE_45_TURBO_128K, const.ERNIE_45_TURBO_32K,
]
# Generic placeholder hints surfaced in the web console. We deliberately
@@ -787,6 +789,7 @@ class ConfigHandler:
# never looks like a real default a user might paste verbatim — and we
# never auto-rewrite anything on the server side.
_PLACEHOLDER_V1 = "https://...../v1"
_PLACEHOLDER_QIANFAN = "https://...../v2"
_PLACEHOLDER_ZHIPU = "https://...../api/paas/v4"
_PLACEHOLDER_DOUBAO = "https://...../api/v3"
_PLACEHOLDER_GEMINI = "https://....."
@@ -864,6 +867,14 @@ class ConfigHandler:
"api_base_placeholder": _PLACEHOLDER_V1,
"models": [const.KIMI_K2_6, const.KIMI_K2_5, const.KIMI_K2],
}),
("qianfan", {
"label": "百度千帆",
"api_key_field": "qianfan_api_key",
"api_base_key": "qianfan_api_base",
"api_base_default": "https://qianfan.baidubce.com/v2",
"api_base_placeholder": _PLACEHOLDER_QIANFAN,
"models": [const.ERNIE_5, const.ERNIE_X1_1, const.ERNIE_45_TURBO_128K, const.ERNIE_45_TURBO_32K],
}),
("modelscope", {
"label": "ModelScope",
"api_key_field": "modelscope_api_key",
@@ -892,9 +903,9 @@ class ConfigHandler:
EDITABLE_KEYS = {
"model", "bot_type", "use_linkai",
"open_ai_api_base", "deepseek_api_base", "claude_api_base", "gemini_api_base",
"open_ai_api_base", "deepseek_api_base", "qianfan_api_base", "claude_api_base", "gemini_api_base",
"zhipu_ai_api_base", "moonshot_base_url", "ark_base_url", "custom_api_base",
"open_ai_api_key", "deepseek_api_key", "claude_api_key", "gemini_api_key",
"open_ai_api_key", "deepseek_api_key", "qianfan_api_key", "claude_api_key", "gemini_api_key",
"zhipu_ai_api_key", "dashscope_api_key", "moonshot_api_key",
"ark_api_key", "minimax_api_key", "linkai_api_key", "custom_api_key",
"agent_max_context_tokens", "agent_max_context_turns", "agent_max_steps",
@@ -1034,8 +1045,6 @@ class ChannelsHandler:
"fields": [
{"key": "feishu_app_id", "label": "App ID", "type": "text"},
{"key": "feishu_app_secret", "label": "App Secret", "type": "secret"},
{"key": "feishu_token", "label": "Verification Token", "type": "secret"},
{"key": "feishu_bot_name", "label": "Bot Name", "type": "text"},
],
}),
("dingtalk", {
@@ -1530,6 +1539,174 @@ class WeixinQrHandler:
return json.dumps({"status": "success", "qr_status": qr_status})
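class FeishuRegisterHandler:
"""One-click creation of a Feishu agent app (OAuth device authorization flow, based on the lark.register_app SDK).
GET /api/feishu/register → start registration: call the SDK to generate the QR code URL and return immediately;
a background thread keeps polling Feishu until the user scans and authorizes.
POST /api/feishu/register → poll the current session status (pending / done / error / expired / denied).
After a successful registration the config is not written directly; the frontend then calls
/api/channels {action:'connect'} to go through the standard enable flow.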
"""
# Process-wide singleton state ({url, expire_in, qr_image, status, app_id, app_secret, error, cancel_event}).
# Simple local self-hosted deployments do not need per-session isolation.
_state = {}
_lock = threading.Lock()
@staticmethod
def _qr_to_data_uri(data: str) -> str:
"""Reuse WeixinQrHandler's QR code rendering."""
return WeixinQrHandler._qr_to_data_uri(data)
@classmethod
def _reset_state(cls):
with cls._lock:
cls._state = {}
@classmethod
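def _start_register_thread(cls):
"""Start a new registration session. If one is already in progress, cancel it first (via cancel_event)."""
# Cancel any previous session first so two SDK threads do not poll the same endpoint concurrently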
with cls._lock:
old_cancel = cls._state.get("cancel_event") if cls._state else None
if old_cancel is not None:
old_cancel.set()
cancel_event = threading.Event()
cls._state = {"status": "starting", "cancel_event": cancel_event}
def _worker():
try:
import lark_oapi as lark
except ImportError:
with cls._lock:
cls._state["status"] = "error"
cls._state["error"] = "lark-oapi SDK 未安装,请执行 pip install -U lark-oapi"
return
def _on_qr(info):
# The SDK calls back as soon as it has the QR code URL; write it into state so the frontend GET can return it right away
with cls._lock:
cls._state["url"] = info.get("url", "")
cls._state["expire_in"] = info.get("expire_in", 600)
cls._state["qr_image"] = cls._qr_to_data_uri(info.get("url", ""))
cls._state["status"] = "pending"
logger.info(f"[FeishuRegister] QR ready, expire_in={info.get('expire_in')}s")
def _on_status(info):
# Filter out polling heartbeats (one every 5 seconds, pure noise);
# keep real state transitions such as slow_down / domain_switched
status = info.get("status")
if status == "polling":
return
logger.info(f"[FeishuRegister] SDK status: {info}")
try:
result = lark.register_app(
on_qr_code=_on_qr,
on_status_change=_on_status,
source="cowagent",
cancel_event=cancel_event,
)
with cls._lock:
cls._state["status"] = "done"
cls._state["app_id"] = result.get("client_id", "")
cls._state["app_secret"] = result.get("client_secret", "")
logger.info(f"[FeishuRegister] App created: app_id={result.get('client_id')}")
except Exception as e:
err_msg = str(e)
err_cls = e.__class__.__name__
# Raised by the Feishu SDK: AppExpiredError / AppAccessDeniedError / RegisterAppError
if "Expired" in err_cls:
status = "expired"
elif "Denied" in err_cls:
status = "denied"
elif "abort" in err_msg.lower() or "cancel" in err_msg.lower():
# Preempted by a newer registration round; stay quiet
return
else:
status = "error"
with cls._lock:
# Write only if the current state still belongs to this worker, to avoid overwriting a newer session
if cls._state.get("cancel_event") is cancel_event:
cls._state["status"] = status
cls._state["error"] = err_msg
logger.warning(f"[FeishuRegister] Register failed ({err_cls}): {err_msg}")
threading.Thread(target=_worker, daemon=True, name="feishu-register").start()
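The error handling in `_worker` maps an exception to a session status by inspecting its class name and message. A minimal sketch of that mapping, pulled out as a standalone function: `classify_register_error` is a hypothetical helper, and the two exception classes below are local stand-ins that only borrow the names mentioned in the comment above; they are not imported from the SDK.

```python
class AppExpiredError(Exception):
    """Local stand-in; only the class name matters for the mapping."""


class AppAccessDeniedError(Exception):
    """Local stand-in; only the class name matters for the mapping."""


def classify_register_error(exc):
    """Return 'expired', 'denied', 'error', or None when the failure should be ignored."""
    name = exc.__class__.__name__
    message = str(exc).lower()
    if "Expired" in name:
        return "expired"
    if "Denied" in name:
        return "denied"
    if "abort" in message or "cancel" in message:
        return None                       # preempted by a newer session: report nothing
    return "error"
```

Matching on the class name avoids importing the SDK's exception types at module load time, at the cost of depending on their naming; that trade-off is an observation about this sketch, not a claim about the SDK.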
def GET(self):
"""Start a new registration session, replacing any existing pending/done session."""
_require_auth()
web.header('Content-Type', 'application/json; charset=utf-8')
try:
self._start_register_thread()
# Wait up to 10s for the SDK to obtain the QR code URL. The SDK invokes _on_qr right away.
import time as _t
for _ in range(100):
with self._lock:
if self._state.get("url") or self._state.get("status") in ("error", "expired", "denied"):
break
_t.sleep(0.1)
with self._lock:
if self._state.get("status") in ("error", "expired", "denied"):
return json.dumps({
"status": "error",
"message": self._state.get("error", "register failed"),
})
if not self._state.get("url"):
return json.dumps({
"status": "error",
"message": "等待飞书二维码超时,请重试",
})
return json.dumps({
"status": "success",
"qrcode_url": self._state["url"],
"qr_image": self._state.get("qr_image", ""),
"expire_in": self._state.get("expire_in", 600),
})
except Exception as e:
logger.error(f"[WebChannel] FeishuRegister GET error: {e}")
return json.dumps({"status": "error", "message": str(e)})
def POST(self):
"""Poll the registration result."""
_require_auth()
web.header('Content-Type', 'application/json; charset=utf-8')
try:
body = json.loads(web.data() or b"{}")
action = body.get("action", "poll")
if action != "poll":
return json.dumps({"status": "error", "message": f"unknown action: {action}"})
with self._lock:
status = self._state.get("status", "idle")
if status == "done":
payload = {
"status": "success",
"register_status": "done",
"app_id": self._state.get("app_id", ""),
"app_secret": self._state.get("app_secret", ""),
}
# Clear the credentials after returning them once, so sensitive data does not linger in memory
self._state = {}
return json.dumps(payload)
if status in ("error", "expired", "denied"):
return json.dumps({
"status": "success",
"register_status": status,
"message": self._state.get("error", ""),
})
# pending / starting: still waiting for the user to scan the QR code
return json.dumps({
"status": "success",
"register_status": "pending",
})
except Exception as e:
logger.error(f"[WebChannel] FeishuRegister POST error: {e}")
return json.dumps({"status": "error", "message": str(e)})
def _get_workspace_root():
"""Resolve the agent workspace directory."""
from common.utils import expand_path
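
The registration handler above combines three ideas: mapping SDK exceptions to a status by class name, waiting a bounded time for the QR code URL, and handing out credentials exactly once. The following is a minimal standalone sketch of that flow, not the project's actual class. `RegisterSession`, `classify_error`, and `TERMINAL` are hypothetical names introduced here for illustration, and the web framework, auth check, and Feishu SDK calls are left out.

```python
import threading
import time

# Statuses that end a registration attempt (taken from the handler above).
TERMINAL = ("error", "expired", "denied")


def classify_error(exc):
    """Map an exception to a status string, or None when it should be ignored."""
    name = exc.__class__.__name__
    text = str(exc).lower()
    if "Expired" in name:
        return "expired"
    if "Denied" in name:
        return "denied"
    if "abort" in text or "cancel" in text:
        return None  # preempted by a newer round: stay quiet
    return "error"


class RegisterSession:
    """Hypothetical stand-in for the handler's shared _state/_lock pair."""

    def __init__(self):
        self._lock = threading.Lock()
        self._state = {}

    def update(self, **fields):
        with self._lock:
            self._state.update(fields)

    def wait_for_qr(self, timeout=10.0, interval=0.1):
        """Block until a QR URL or a terminal status appears, or the timeout elapses."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            with self._lock:
                if self._state.get("url") or self._state.get("status") in TERMINAL:
                    break
            time.sleep(interval)
        with self._lock:
            return dict(self._state)

    def poll(self):
        """Report progress; return credentials once, then forget them."""
        with self._lock:
            status = self._state.get("status", "idle")
            if status == "done":
                payload = {
                    "register_status": "done",
                    "app_id": self._state.get("app_id", ""),
                    "app_secret": self._state.get("app_secret", ""),
                }
                self._state = {}  # one-shot hand-off of the secret
                return payload
            if status in TERMINAL:
                return {"register_status": status, "message": self._state.get("error", "")}
            return {"register_status": "pending"}
```

Because `poll()` clears the state after a `done` result, a second poll reports `pending` again, so a client should stop polling as soon as it has received the credentials.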

View File

@@ -1 +1 @@
2.0.7
2.0.8

View File

@@ -3,6 +3,7 @@ OPEN_AI = "openAI"
OPENAI = "openai"
CHATGPT = "chatGPT" # legacy alias for OPENAI, kept for backward compatibility
BAIDU = "baidu"
QIANFAN = "qianfan"
XUNFEI = "xunfei"
CHATGPTONAZURE = "chatGPTOnAzure"
LINKAI = "linkai"
@@ -85,6 +86,15 @@ DEEPSEEK_REASONER = "deepseek-reasoner" # DeepSeek-R1 model
DEEPSEEK_V4_FLASH = "deepseek-v4-flash" # DeepSeek V4 Flash - default recommendation (thinking mode + tool calling)
DEEPSEEK_V4_PRO = "deepseek-v4-pro" # DeepSeek V4 Pro - stronger on complex tasks (thinking mode + tool calling)
# Baidu Qianfan / ERNIE
ERNIE_5 = "ernie-5.0" # ERNIE 5.0 - default recommendation
ERNIE_X1_1 = "ernie-x1.1" # ERNIE X1.1 - reasoning-focused, multimodal
ERNIE_45_TURBO_128K = "ernie-4.5-turbo-128k"
ERNIE_45_TURBO_32K = "ernie-4.5-turbo-32k"
ERNIE_4_TURBO_8K = "ERNIE-4.0-Turbo-8K"
ERNIE_45_TURBO_VL = "ernie-4.5-turbo-vl"
ERNIE_45_TURBO_VL_32K = "ernie-4.5-turbo-vl-32k"
# Qwen (Tongyi Qianwen - Alibaba Cloud DashScope)
QWEN_TURBO = "qwen-turbo"
QWEN_PLUS = "qwen-plus"
@@ -159,6 +169,10 @@ MODEL_LIST = [
# DeepSeek
DEEPSEEK_V4_FLASH, DEEPSEEK_V4_PRO, DEEPSEEK_CHAT, DEEPSEEK_REASONER,
# Baidu Qianfan / ERNIE
QIANFAN, ERNIE_5, ERNIE_X1_1, ERNIE_45_TURBO_128K, ERNIE_45_TURBO_32K, ERNIE_4_TURBO_8K,
ERNIE_45_TURBO_VL, ERNIE_45_TURBO_VL_32K,
# MiniMax
MiniMax, MINIMAX_M2_7, MINIMAX_M2_7_HIGHSPEED, MINIMAX_M2_5, MINIMAX_M2_1, MINIMAX_M2_1_LIGHTNING, MINIMAX_M2, MINIMAX_ABAB6_5,

View File

@@ -3,6 +3,8 @@
"model": "deepseek-v4-flash",
"deepseek_api_key": "",
"deepseek_api_base": "https://api.deepseek.com/v1",
"qianfan_api_key": "",
"qianfan_api_base": "https://qianfan.baidubce.com/v2",
"minimax_api_key": "",
"zhipu_ai_api_key": "",
"ark_api_key": "",
@@ -24,8 +26,9 @@
"linkai_app_code": "",
"feishu_app_id": "",
"feishu_app_secret": "",
"feishu_stream_reply": true,
"dingtalk_client_id": "",
"dingtalk_client_secret":"",
"dingtalk_client_secret": "",
"wecom_bot_id": "",
"wecom_bot_secret": "",
"web_password": "",

View File

@@ -76,6 +76,9 @@ available_setting = {
"baidu_wenxin_api_key": "", # Baidu api key
"baidu_wenxin_secret_key": "", # Baidu secret key
"baidu_wenxin_prompt_enabled": False, # Enable prompt if you are using ernie character model
# Baidu Qianfan / ERNIE OpenAI-compatible API
"qianfan_api_key": "", # Baidu Qianfan API key in bce-v3 format
"qianfan_api_base": "https://qianfan.baidubce.com/v2", # Qianfan OpenAI-compatible API base
# iFlytek Spark API
"xunfei_app_id": "", # iFlytek app ID
"xunfei_api_key": "", # iFlytek API key
@@ -123,10 +126,13 @@ available_setting = {
"chat_start_time": "00:00", # service start time
"chat_stop_time": "24:00", # service stop time
# translation API
"translate": "baidu", # translation API; supports baidu
"translate": "baidu", # translation API; supports baidu, youdao
# Baidu translation API settings
"baidu_translate_app_id": "", # Baidu translation API app ID
"baidu_translate_app_key": "", # Baidu translation API secret key
# Youdao translation API settings
"youdao_translate_app_key": "", # Youdao translation API app ID
"youdao_translate_app_secret": "", # Youdao translation API app secret
# wechatmp settings
"wechatmp_token": "", # WeChat Official Account Platform token
"wechatmp_port": 8080, # WeChat Official Account Platform port; must be forwarded to 80 or 443
@@ -142,12 +148,13 @@ available_setting = {
"wechatcomapp_agent_id": "", # WeCom app agent_id
"wechatcomapp_aes_key": "", # WeCom app aes_key
# Feishu settings
"feishu_port": 80, # Feishu bot listening port
"feishu_port": 80, # Feishu bot listening port (webhook mode only)
"feishu_app_id": "", # Feishu bot app ID
"feishu_app_secret": "", # Feishu bot app secret
"feishu_token": "", # Feishu verification token
"feishu_bot_name": "", # Feishu bot name
"feishu_token": "", # Feishu verification token (webhook mode only)
"feishu_event_mode": "websocket", # Feishu event receiving mode: webhook (HTTP server) or websocket (long connection)
# Feishu streaming reply (built on the official cardkit streaming card API; the bot needs the cardkit:card:write permission and the Feishu client must be 7.20+)
"feishu_stream_reply": True, # Enable streaming replies (typewriter effect). On failure or on old clients it degrades automatically to non-streaming or an upgrade prompt
# DingTalk settings
"dingtalk_client_id": "", # DingTalk bot Client ID
"dingtalk_client_secret": "", # DingTalk bot Client Secret
@@ -228,13 +235,13 @@ class Config(dict):
def __getitem__(self, key):
# Skip comment fields whose names start with an underscore
if not key.startswith("_") and key not in available_setting:
logger.warning("[Config] key '{}' not in available_setting, may not take effect".format(key))
logger.debug("[Config] key '{}' not in available_setting, may not take effect".format(key))
return super().__getitem__(key)
def __setitem__(self, key, value):
# Skip comment fields whose names start with an underscore
if not key.startswith("_") and key not in available_setting:
logger.warning("[Config] key '{}' not in available_setting, may not take effect".format(key))
logger.debug("[Config] key '{}' not in available_setting, may not take effect".format(key))
return super().__setitem__(key, value)
def get(self, key, default=None):
@@ -386,6 +393,8 @@ def load_config():
"minimax_api_base": "MINIMAX_API_BASE",
"deepseek_api_key": "DEEPSEEK_API_KEY",
"deepseek_api_base": "DEEPSEEK_API_BASE",
"qianfan_api_key": "QIANFAN_API_KEY",
"qianfan_api_base": "QIANFAN_API_BASE",
"zhipu_ai_api_key": "ZHIPU_AI_API_KEY",
"zhipu_ai_api_base": "ZHIPU_AI_API_BASE",
"moonshot_api_key": "MOONSHOT_API_KEY",

View File

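@@ -3,67 +3,109 @@ title: Feishu
description: Integrate CowAgent into a Feishu app
---
Integrate CowAgent into Feishu through a custom app. You need to be a Feishu enterprise user with enterprise admin privileges.
> Integrate CowAgent through a Feishu custom app. Direct chats and group chats (@bot) are supported. The WebSocket long-connection mode needs no public IP, and streaming typewriter replies and voice messages (send and receive) are supported.
## 1. Create an Enterprise Custom App
<Note>
You need to be a Feishu enterprise user with enterprise admin privileges.
</Note>
### 1. Create the App
## 1. Setup Options
Go to the [Feishu Developer Platform](https://open.feishu.cn/app/), click **Create Enterprise Custom App**, fill in the required information, and click **Create**
### Option 1: One-click QR-scan Setup (Recommended)
After starting the Cow project, you can complete QR-scan creation directly in the terminal. Alternatively, open the Web Console (local URL: http://127.0.0.1:9899), go to the **Channels** menu, click **Add Channel**, choose **Feishu**, click **One-click Create Feishu App**, and scan the QR code with the **Feishu App** to create and connect the app automatically: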
<img src="https://cdn.link-ai.tech/doc/20260505181126.png" width="800"/>
<Note>
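1. The `lark-oapi` dependency must be version >=1.5.5.
2. An app created by QR scan comes with all required permissions (messaging, card read/write, group chat events, etc.) and event subscriptions preconfigured, so no manual setup in the developer console is needed.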
</Note>
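### Option 2: Manual Setup
First create a custom app on the Feishu Open Platform and configure its permissions, then connect through the Web Console or the config file.
**Step 1: Create the App**
1. Go to the [Feishu Developer Platform](https://open.feishu.cn/app/) and click **Create Enterprise Custom App**: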
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-create-app.jpg" width="500"/>
### 2. Add Bot Capability
In the **Add App Capabilities** menu, add the **Bot** capability to the app:
2. In **Add App Capabilities**, add the **Bot** capability to the app:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-add-bot.jpg" width="800"/>
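### 3. Configure App Permissions
Click **Permission Management**, copy the permission string below, paste it into the input box under **Permission Configuration**, select all the filtered permissions, click **Batch Enable**, and confirm:
3. In **Permission Management**, paste the following permissions into the input box, select all, and **Batch Enable**: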
```
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource,cardkit:card:write
```
<img src="https://cdn.link-ai.tech/doc/feishu-hosting-add-auth2.png" width="800"/>
## 2. Project Configuration
1. Get the `App ID` and `App Secret` from **Credentials & Basic Info**:
4. Get the `App ID` and `App Secret` from **Credentials & Basic Info**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-appid-secret.jpg" width="800"/>
2. Add the following configuration to `config.json` in the project root:
**Step 2: Connect to CowAgent**
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_bot_name": "YOUR_BOT_NAME"
}
```
<Tabs>
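<Tab title="Web Console">
Open the Web Console, go to the **Channels** menu, click **Add Channel**, choose **Feishu**, switch to the **Manual** tab, enter the App ID and App Secret, and click connect.
</Tab>
<Tab title="Config File">
Add the following to `config.json`, then start the program: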
| Parameter | Description |
| --- | --- |
| `feishu_app_id` | Feishu bot App ID |
| `feishu_app_secret` | Feishu bot App Secret |
| `feishu_bot_name` | Feishu bot name (set when creating the app); group chat usage depends on this setting |
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_stream_reply": true
}
```
Start the project once configuration is complete.
| Parameter | Description | Default |
| --- | --- | --- |
| `feishu_app_id` | Feishu app App ID | - |
| `feishu_app_secret` | Feishu app App Secret | - |
| `feishu_stream_reply` | Enable streaming typewriter replies | `true` |
</Tab>
</Tabs>
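## 3. Configure Event Subscription
**Step 3: Publish the App**
1. Once the project is running successfully, click **Events & Callbacks** on the Feishu Open Platform, select the **Long Connection** method, and click save:
1. After starting the Cow project, click **Events & Callbacks** on the Feishu Open Platform, select **Long Connection** mode, and save: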
<img src="https://cdn.link-ai.tech/doc/202601311731183.png" width="600"/>
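2. Click **Add Event** below, search for "Receive Message", select "**Receive Message v2.0**", and confirm
2. Click **Add Event**, search for "Receive Message", select **Receive Message v2.0**, and confirm
3. Click **Version Management & Release**, create a version and apply for **Production Release**, then check the approval message in the Feishu client and approve it:
3. Click **Version Management & Release**, create a version and apply for **Production Release**, then approve it in the Feishu client: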
<img src="https://cdn.link-ai.tech/doc/202601311807356.png" width="600"/>
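Once this is done, search for the bot name in Feishu to start chatting.
## 2. Features
| Feature | Status |
| --- | --- |
| Direct chat | ✅ |
| Group chat (@bot) | ✅ |
| Text messages | ✅ send/receive |
| Image messages | ✅ send/receive |
| Voice messages | ✅ send/receive |
| Streaming reply | ✅ (controlled by `feishu_stream_reply`, enabled by default) |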
<Note>
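Streaming reply requires the bot to have the `cardkit:card:write` permission (enabled by default with one-click creation) and the recipient's Feishu client to be version ≥ 7.20. Older clients show an upgrade prompt; when the permission or version requirement is not met, replies fall back to plain text automatically.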
</Note>
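## 3. Usage
After setup, search for the bot name in Feishu to start a direct chat.
To use it in a group chat, add the bot to the group and @-mention it when sending a message.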

View File

@@ -81,6 +81,7 @@
"models/qwen",
"models/doubao",
"models/kimi",
"models/qianfan",
"models/linkai",
"models/coding-plan",
"models/custom"
@@ -208,6 +209,7 @@
"group": "发布记录",
"pages": [
"releases/overview",
"releases/v2.0.8",
"releases/v2.0.7",
"releases/v2.0.6",
"releases/v2.0.5",
@@ -266,6 +268,7 @@
"en/models/qwen",
"en/models/doubao",
"en/models/kimi",
"en/models/qianfan",
"en/models/linkai",
"en/models/coding-plan",
"en/models/custom"
@@ -392,10 +395,12 @@
"group": "Release Notes",
"pages": [
"en/releases/overview",
"en/releases/v2.0.8",
"en/releases/v2.0.7",
"en/releases/v2.0.6",
"en/releases/v2.0.5",
"en/releases/v2.0.4",
"en/releases/v2.0.3",
"en/releases/v2.0.2",
"en/releases/v2.0.1",
"en/releases/v2.0.0"
@@ -450,6 +455,7 @@
"ja/models/qwen",
"ja/models/doubao",
"ja/models/kimi",
"ja/models/qianfan",
"ja/models/linkai",
"ja/models/coding-plan",
"ja/models/custom"
@@ -577,6 +583,7 @@
"group": "リリースノート",
"pages": [
"ja/releases/overview",
"ja/releases/v2.0.8",
"ja/releases/v2.0.7",
"ja/releases/v2.0.6",
"ja/releases/v2.0.5",

View File

@@ -1,69 +1,107 @@
---
title: Feishu (Lark)
description: Integrate CowAgent into Feishu application
description: Integrate CowAgent into Feishu via a custom enterprise app
---
Integrate CowAgent into Feishu by creating a custom enterprise app. You need to be a Feishu enterprise user with admin privileges.
> Integrate CowAgent into Feishu via a custom enterprise app. It supports p2p chat and group chat (@bot), uses a WebSocket long connection (no public IP needed), and supports streaming typewriter replies and voice messages.
## 1. Create Enterprise Custom App
<Note>
You need to be a Feishu enterprise user with admin privileges.
</Note>
### 1.1 Create App
## 1. Setup
Go to [Feishu Developer Platform](https://open.feishu.cn/app/), click **Create Enterprise Custom App**, fill in the required information and click **Create**:
### Option 1: One-click Scan to Create (Recommended)
No need to manually create an app on the Feishu Developer Platform. Start the Cow project, open the web console (default `http://127.0.0.1:9899/`), go to **Channels**, click **Add Channel**, choose **Feishu**, then under the **Scan QR** tab click **One-click Create Feishu App** and scan with the **Feishu App** to complete app creation and connection automatically.
<Note>
The created app comes with all required permissions (messaging, card read/write, group events, etc.) and event subscriptions pre-configured. Currently only the Feishu mainland version is supported (Lark international not yet supported).
</Note>
When starting from CLI without `feishu_app_id` configured, the QR code is also printed to the terminal.
### Option 2: Manual Setup
Manually create a custom app on the Feishu Developer Platform, then connect via Web Console or config file.
**Step 1: Create the App**
1. Go to [Feishu Developer Platform](https://open.feishu.cn/app/), click **Create Enterprise Custom App**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-create-app.jpg" width="500"/>
### 1.2 Add Bot Capability
In **Add App Capabilities**, add **Bot** capability to the app:
2. In **Add App Capabilities**, add the **Bot** capability:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-add-bot.jpg" width="800"/>
### 1.3 Configure App Permissions
Click **Permission Management**, paste the following permission string into the input box below **Permission Configuration**, select all filtered permissions, click **Batch Enable** and confirm:
3. In **Permission Management**, paste the following permissions and **Batch Enable** all:
```
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource,cardkit:card:write
```
<img src="https://cdn.link-ai.tech/doc/feishu-hosting-add-auth2.png" width="800"/>
## 2. Project Configuration
1. Get `App ID` and `App Secret` from **Credentials & Basic Info**:
4. Get `App ID` and `App Secret` from **Credentials & Basic Info**:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-appid-secret.jpg" width="800"/>
2. Add the following configuration to `config.json` in the project root:
**Step 2: Connect to CowAgent**
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_bot_name": "YOUR_BOT_NAME"
}
```
<Tabs>
<Tab title="Web Console">
Open the web console, go to **Channels**, click **Add Channel**, choose **Feishu**, switch to the **Manual** tab, enter App ID and App Secret, then click connect.
</Tab>
<Tab title="Config File">
Add the following to `config.json` and start the program:
| Parameter | Description |
| --- | --- |
| `feishu_app_id` | Feishu bot App ID |
| `feishu_app_secret` | Feishu bot App Secret |
| `feishu_bot_name` | Bot name (set when creating the app), required for group chat usage |
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_stream_reply": true
}
```
Start the project after configuration is complete.
| Parameter | Description | Default |
| --- | --- | --- |
| `feishu_app_id` | Feishu app App ID | - |
| `feishu_app_secret` | Feishu app App Secret | - |
| `feishu_stream_reply` | Enable streaming typewriter reply | `true` |
</Tab>
</Tabs>
## 3. Configure Event Subscription
**Step 3: Publish the App**
1. After the project is running successfully, go to the Feishu Developer Platform, click **Events & Callbacks**, select **Long Connection** mode, and click save:
1. After Cow is running, go to **Events & Callbacks** in the Feishu Developer Platform, choose **Long Connection** mode and save:
<img src="https://cdn.link-ai.tech/doc/202601311731183.png" width="600"/>
2. Click **Add Event** below, search for "Receive Message", select "**Receive Message v2.0**", and confirm.
2. Click **Add Event**, search for "Receive Message" and choose **Receive Message v2.0**.
3. Click **Version Management & Release**, create a new version and apply for **Production Release**. Check the approval message in the Feishu client and approve:
3. Click **Version Management & Release**, create a version and apply for **Production Release**. Approve the request in the Feishu client:
<img src="https://cdn.link-ai.tech/doc/202601311807356.png" width="600"/>
Once completed, search for the bot name in Feishu to start chatting.
## 2. Features
| Feature | Status |
| --- | --- |
| P2P chat | ✅ |
| Group chat (@bot) | ✅ |
| Text messages | ✅ send/receive |
| Image messages | ✅ send/receive |
| Voice messages | ✅ send/receive |
| Streaming reply | ✅ (powered by Feishu cardkit streaming card) |
<Note>
Streaming reply requires the `cardkit:card:write` permission (already enabled by one-click creation) and Feishu client version ≥ 7.20. Older clients see an upgrade prompt; if the permission or version is not satisfied, replies fall back to plain text automatically.
</Note>
## 3. Usage
After connection, search for the bot name in Feishu to start a chat.
To use in groups, add the bot to a group and @-mention it.

View File

@@ -6,7 +6,7 @@ description: Supported models and recommended choices for CowAgent
CowAgent supports mainstream LLMs from domestic and international providers. Model interfaces are implemented in the project's `models/` directory.
<Note>
For Agent mode, the following models are recommended based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, claude-sonnet-4-6, gemini-3.1-pro-preview, glm-5.1, qwen3.6-plus, kimi-k2.6
For Agent mode, the following models are recommended based on quality and cost: deepseek-v4-flash, MiniMax-M2.7, claude-sonnet-4-6, gemini-3.1-pro-preview, glm-5.1, qwen3.6-plus, kimi-k2.6, ernie-5.0
</Note>
## Configuration
@@ -21,6 +21,9 @@ You can also use the [LinkAI](https://link-ai.tech) platform interface to flexib
<Card title="DeepSeek" href="/en/models/deepseek">
deepseek-v4-flash, deepseek-v4-pro, and more
</Card>
<Card title="Baidu Qianfan / ERNIE" href="/en/models/qianfan">
ernie-5.0, ernie-4.5-turbo-128k, and more
</Card>
<Card title="MiniMax" href="/en/models/minimax">
MiniMax-M2.7 and other series models
</Card>

View File

@@ -0,0 +1,63 @@
---
title: Baidu Qianfan / ERNIE
description: Baidu Qianfan ERNIE model configuration
---
Option 1: Native integration (recommended):
```json
{
"model": "ernie-5.0",
"qianfan_api_key": "",
"qianfan_api_base": "https://qianfan.baidubce.com/v2"
}
```
| Parameter | Description |
| --- | --- |
| `model` | Default recommendation: `ernie-5.0`; also supports `ernie-x1.1`, `ernie-4.5-turbo-128k`, `ernie-4.5-turbo-32k` |
| `qianfan_api_key` | Qianfan API key, usually starting with `bce-v3/` |
| `qianfan_api_base` | Optional, defaults to `https://qianfan.baidubce.com/v2` |
## Model Selection
| Model | Use Case |
| --- | --- |
| `ernie-5.0` | Default recommendation; latest ERNIE flagship with the strongest overall capability |
| `ernie-x1.1` | Deep-thinking reasoning model with lower hallucination and stronger instruction following / tool calling |
| `ernie-4.5-turbo-128k` | Long-context and general chat |
| `ernie-4.5-turbo-32k` | General chat with a balanced context window and cost |
## Vision tool
Once `qianfan_api_key` is configured, Agent mode can auto-discover Qianfan for the Vision tool:
- When the main model itself is multimodal (e.g. `ernie-5.0`, `ernie-x1.1`, `ernie-4.5-turbo-vl`), images are handled directly by the main model with no extra setup.
- When the main model is text-only (e.g. `ernie-4.5-turbo-128k`), the Vision tool automatically falls back to `ernie-4.5-turbo-vl`.
To force a specific Vision model, set it explicitly in `config.json`:
```json
{
"tool": {
"vision": {
"model": "ernie-4.5-turbo-vl"
}
}
}
```
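
The selection rules described above can be sketched as a small function. This is only an illustration of the documented behavior, not the project's actual routing code: `pick_vision_model`, `QIANFAN_MULTIMODAL`, and `FALLBACK_VISION_MODEL` are hypothetical names, and the multimodal set contains only the models this page names as examples.

```python
# Models this page lists as multimodal examples (assumed, not exhaustive).
QIANFAN_MULTIMODAL = {"ernie-5.0", "ernie-x1.1", "ernie-4.5-turbo-vl"}
FALLBACK_VISION_MODEL = "ernie-4.5-turbo-vl"


def pick_vision_model(config):
    """Return the model the Vision tool would use under the documented rules."""
    # 1. An explicit tool.vision.model setting wins.
    explicit = config.get("tool", {}).get("vision", {}).get("model")
    if explicit:
        return explicit
    # 2. A multimodal main model handles images itself.
    main_model = config.get("model", "")
    if main_model in QIANFAN_MULTIMODAL:
        return main_model
    # 3. A text-only main model falls back to the VL model.
    return FALLBACK_VISION_MODEL
```

For example, a config with `"model": "ernie-4.5-turbo-128k"` and no `tool.vision.model` entry resolves to `ernie-4.5-turbo-vl` in this sketch.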
Option 2: OpenAI-compatible configuration:
```json
{
"model": "ernie-5.0",
"bot_type": "openai",
"open_ai_api_key": "",
"open_ai_api_base": "https://qianfan.baidubce.com/v2"
}
```
<Tip>
Prefer `qianfan_api_key` for new configurations. Existing `wenxin`, `wenxin-4`, `baidu_wenxin_api_key`, and `baidu_wenxin_secret_key` configurations remain supported.
</Tip>
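
To check a key outside CowAgent, a request against the OpenAI-compatible base can be built as in the sketch below. It assumes the usual OpenAI-style conventions, namely a `/chat/completions` path under the API base and a `Bearer` authorization header; neither is stated on this page, so treat them as assumptions. `build_chat_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_chat_request(api_key, prompt,
                       api_base="https://qianfan.baidubce.com/v2",
                       model="ernie-5.0"):
    """Build (but do not send) an OpenAI-style chat completion request."""
    url = api_base.rstrip("/") + "/chat/completions"  # assumed path
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the key shown in any example should be replaced with a real `bce-v3/...` key.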

View File

@@ -0,0 +1,91 @@
---
title: v2.0.3
description: CowAgent 2.0.3 - WeCom Smart Bot and QQ channels, Web Console file handling, memory system upgrade
---
## 🔌 New Channels
### WeCom Smart Bot
Added the WeCom Smart Bot (`wecom_bot`) channel with streaming card output, support for receiving and replying to text and image messages, and full configuration through the Web Console.
Documentation: [WeCom Smart Bot](https://docs.cowagent.ai/en/channels/wecom-bot).
Related commits: [d4480b6](https://github.com/zhayujie/CowAgent/commit/d4480b6), [a42f31f](https://github.com/zhayujie/CowAgent/commit/a42f31f), [4ecd4df](https://github.com/zhayujie/CowAgent/commit/4ecd4df), [8b45d6c](https://github.com/zhayujie/CowAgent/commit/8b45d6c)
### QQ Channel
Added the QQ official bot (`qq`) channel with support for text and image messages in both private chats and group chats.
Documentation: [QQ Bot](https://docs.cowagent.ai/en/channels/qq).
Related commits: [005a0e1](https://github.com/zhayujie/CowAgent/commit/005a0e1), [a4d54f5](https://github.com/zhayujie/CowAgent/commit/a4d54f5)
## 🖥️ Web Console File Input and Processing
The Web Console chat UI now supports file and image uploads — files can be sent directly to the agent for processing. The Read tool gains parsing support for Office documents (Word, Excel, PPT).
Related commits: [30c6d9b](https://github.com/zhayujie/CowAgent/commit/30c6d9b)
## 🤖 New Models
- **GPT-5.4 Series**: Added `gpt-5.4`, `gpt-5.4-mini`, `gpt-5.4-nano` ([1623deb](https://github.com/zhayujie/CowAgent/commit/1623deb))
- **Gemini 3.1 Flash Lite Preview**: Added `gemini-3.1-flash-lite-preview` ([ba915f2](https://github.com/zhayujie/CowAgent/commit/ba915f2))
## 💰 Coding Plan Support
Added integration with vendor Coding Plan (monthly programming subscription) tiers via the unified OpenAI-compatible path. Supported vendors include Aliyun, MiniMax, Zhipu GLM, Kimi, and Volcengine.
See [Coding Plan docs](https://docs.cowagent.ai/en/models/coding-plan) for detailed configuration.
## 🧠 Memory System Upgrade
Memory flush improvements:
- Use the LLM to summarize out-of-window conversations into compact daily memory entries
- Summarization runs asynchronously on a background thread, never blocking replies
- Smarter batch trimming policy reduces flush frequency
- Daily scheduled flush as a safety net for low-activity scenarios
- Fixed context-memory loss issues
Related commits: [022c13f](https://github.com/zhayujie/CowAgent/commit/022c13f), [c116235](https://github.com/zhayujie/CowAgent/commit/c116235)
## 🔧 Tool Refactoring
- **Image Vision**: Image recognition (Vision) is refactored from a Skill into a built-in Tool with a dedicated Vision Provider configuration, improving stability and maintainability ([a50fafa](https://github.com/zhayujie/CowAgent/commit/a50fafa), [3b8b562](https://github.com/zhayujie/CowAgent/commit/3b8b562))
- **Web Fetch**: Web fetch is refactored from a Skill into a built-in Tool with support for downloading and parsing remote documents (PDF, Word, Excel, PPT) ([ccb9030](https://github.com/zhayujie/CowAgent/commit/ccb9030), [fa61744](https://github.com/zhayujie/CowAgent/commit/fa61744))
## 🐳 Docker Deployment Improvements
- **Config Template Alignment**: `docker-compose.yml` env vars aligned with `config-template.json`, covering full model API key and Agent settings
- **Web Console Port Mapping**: Added `9899` port mapping so the Web Console is reachable in browser after Docker deployment
- **Hot Config Reload**: Bot API key and API base are now read at request time — changes from the Web Console take effect without restart
- **Workspace Persistence**: Added a `./cow` volume mount so agent workspace data (memories, persona, skills, etc.) persists across container rebuilds and upgrades
## ⚡ Performance Improvements
- **Faster Startup**: The Feishu channel imports its dependencies lazily, avoiding a 4-10s startup delay ([924dc79](https://github.com/zhayujie/CowAgent/commit/924dc79))
- **Channel Stability**: Improved channel connection stability and added env-var support for channel configuration ([f1c04bc](https://github.com/zhayujie/CowAgent/commit/f1c04bc), [46d97fd](https://github.com/zhayujie/CowAgent/commit/46d97fd))
## 🐛 Bug Fixes
- **bot_type Propagation**: Fixed `bot_type` propagation under Agent mode ([#2691](https://github.com/zhayujie/CowAgent/pull/2691)) Thanks [@Weikjssss](https://github.com/Weikjssss)
- **bot_type Resolution Priority**: Adjusted `bot_type` resolution priority under Agent mode ([#2692](https://github.com/zhayujie/CowAgent/pull/2692)) Thanks [@6vision](https://github.com/6vision)
- **Zhipu Config**: Fixed Zhipu `bot_type` naming, Web Console persistence, and regex escaping ([#2693](https://github.com/zhayujie/CowAgent/pull/2693)) Thanks [@6vision](https://github.com/6vision)
- **OpenAI-Compat Layer**: Unified error handling via the `openai_compat` layer ([#2688](https://github.com/zhayujie/CowAgent/pull/2688)) Thanks [@JasonOA888](https://github.com/JasonOA888)
- **OpenAI-Compat Migration**: Completed the `openai_compat` migration across all model bots ([#2689](https://github.com/zhayujie/CowAgent/pull/2689))
- **Gemini Tool Calling**: Fixed tool-call matching for Gemini ([eda82ba](https://github.com/zhayujie/CowAgent/commit/eda82ba))
- **Session Concurrency**: Fixed race conditions in concurrent session scenarios ([9879878](https://github.com/zhayujie/CowAgent/commit/9879878))
- **History Recovery**: Fixed incomplete history recovery — only user/assistant text messages are restored, tool calls are stripped ([b788a3d](https://github.com/zhayujie/CowAgent/commit/b788a3d), [a33ce97](https://github.com/zhayujie/CowAgent/commit/a33ce97))
- **Feishu Group Chat**: Removed the `bot_name` dependency for Feishu group chats ([b641bff](https://github.com/zhayujie/CowAgent/commit/b641bff))
- **Safari Compatibility**: Fixed an IME Enter key issue that mistakenly sent messages on Safari ([0687916](https://github.com/zhayujie/CowAgent/commit/0687916))
- **Windows Compatibility**: Fixed bash-style `$VAR` to `%VAR%` env-var conversion on Windows ([7c67513](https://github.com/zhayujie/CowAgent/commit/7c67513))
- **MiniMax Params**: Added a `max_tokens` cap for MiniMax models ([1767413](https://github.com/zhayujie/CowAgent/commit/1767413))
- **.gitignore**: Added Python directory ignore rules ([#2683](https://github.com/zhayujie/CowAgent/pull/2683)) Thanks [@pelioo](https://github.com/pelioo)
- **AGENT.md Proactive Evolution**: Improved the system prompt guidance around AGENT.md — instead of waiting for explicit user edits, the agent now proactively detects persona/style shifts in the conversation and updates AGENT.md accordingly
## 📦 Upgrade
Run `./run.sh update` for a one-click upgrade, or manually pull the latest code and restart. See [Upgrade Guide](https://docs.cowagent.ai/en/guide/upgrade) for details.
**Release Date**: 2026.03.18 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.2...2.0.3)

View File

@@ -0,0 +1,68 @@
---
title: v2.0.8
description: CowAgent 2.0.8 - Major Feishu channel upgrade (voice, streaming typewriter, one-click QR app creation), DeepSeek V4 / Baidu Qianfan ERNIE 5.0 support, scheduler memory enhancements and multiple fixes
---
## 🪶 Major Feishu Channel Upgrade
### 1. One-click QR-scan App Creation
Manual app setup, permission scopes, and event subscriptions in the Feishu Open Platform are no longer required. When `feishu_app_id` is not configured, both the Web Console and the CLI startup flow show a QR-scan entry: scan with Feishu, authorize, and the bot is created and its credentials are written back to the config automatically.
Documentation: [Feishu Channel](https://docs.cowagent.ai/en/channels/feishu)
### 2. Voice Messages
Receive Feishu voice messages with automatic speech-to-text, and reply in voice via TTS. Recognition accuracy for short Chinese voice messages has been improved.
### 3. Streaming Typewriter Replies
Integrated with Feishu CardKit streaming cards, **enabled by default**, matching the Web Console experience:
- Multi-turn agent flows render intermediate updates and the final reply on separate cards
- Tuned for high-throughput models like DeepSeek to keep pace with the Web Console
- Falls back to plain text replies automatically when not supported, no manual config needed
- Requires Feishu client ≥ 7.20
The voice and streaming building blocks come from community contribution #2791. Thanks [@ooaaooaa123](https://github.com/ooaaooaa123)
## 🤖 New Model Support
- **DeepSeek V4 series**: Added `deepseek-v4-pro` / `deepseek-v4-flash`, with `deepseek-v4-flash` set as the new default
- **Unified thinking-mode toggle**: DeepSeek V4, Qwen3 and other thinking-capable models now share the same `enable_thinking` switch
- **Baidu Qianfan / ERNIE first-class integration**: New `qianfan` provider supporting `ernie-5.0` (default recommendation), `ernie-x1.1`, `ernie-4.5-turbo-128k`, `ernie-4.5-turbo-32k`. Dedicated `qianfan_api_key` / `qianfan_api_base` settings keep OpenAI config clean; legacy `wenxin` / `wenxin-4` paths are fully preserved. #2790 Thanks [@jimmyzhuu](https://github.com/jimmyzhuu)
Documentation: [Baidu Qianfan / ERNIE](https://docs.cowagent.ai/en/models/qianfan)
## 🌐 Translation Provider
- **Youdao translator**: Added a Youdao provider to the `translate/` module using the v3 SHA-256 signing scheme, with automatic ISO 639-1 language-code mapping (`zh`, `zh-TW`, etc.) #2797 Thanks [@Zmjjeff7](https://github.com/Zmjjeff7)
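
As a rough illustration of the v3 signing scheme mentioned above, the sketch below follows the commit description (SHA-256 signature, input truncation for queries longer than 20 characters, `zh` to `zh-CHS` and `zh-TW` to `zh-CHT` mapping). The function names are hypothetical and this is not the `YoudaoTranslator` implementation; the parameter names follow Youdao's public API as I understand it and should be checked against the official documentation.

```python
import hashlib
import time
import uuid

# ISO 639-1 codes that need a Youdao-specific code; others pass through.
LANG_MAP = {"zh": "zh-CHS", "zh-TW": "zh-CHT"}


def truncate(q):
    """Shorten long queries: first 10 chars + length + last 10 chars."""
    if len(q) <= 20:
        return q
    return q[:10] + str(len(q)) + q[-10:]


def sign_v3(app_key, app_secret, q, salt, curtime):
    """SHA-256 over appKey + truncated input + salt + curtime + appSecret."""
    raw = app_key + truncate(q) + salt + curtime + app_secret
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def build_params(app_key, app_secret, q, from_lang, to_lang):
    """Assemble the form parameters for one translation request."""
    salt = str(uuid.uuid4())
    curtime = str(int(time.time()))
    return {
        "q": q,
        "from": LANG_MAP.get(from_lang, from_lang),
        "to": LANG_MAP.get(to_lang, to_lang),
        "appKey": app_key,
        "salt": salt,
        "curtime": curtime,
        "signType": "v3",
        "sign": sign_v3(app_key, app_secret, q, salt, curtime),
    }
```

The truncation keeps the signed string short for long inputs while still binding the signature to the query length and both ends of the text.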
## 🛠 OpenAI Client Refactor
- **Drop SDK dependency**: The OpenAI bot is reimplemented on a native HTTP client — leaner startup, fewer dependency conflicts
- **Web Console hint**: API base inputs in the model config UI now include version-path placeholder hints
## ⏰ Scheduler Memory Enhancements
- **Follow-up on task results**: Scheduled task results are automatically injected into the receiver's session history — the next turn can ask follow-up questions without re-stating context. Thanks [@huangrichao2020](https://github.com/huangrichao2020)
- **No long-term memory pollution**: Scheduler-injected pairs are excluded from the daily memory flush so high-frequency tasks don't drown the memory store
- **Bounded scheduler context**: The scheduler's own session context is automatically capped, so long-running periodic tasks don't accumulate state and slow down replies
## 🔧 Tools and Safety
- **Vision model selection**: `tool.vision.model` config now actually takes effect, with automatic fallback when unconfigured #2792
- **Bash safety prompt**: The destructive-deletion confirm prompt is now scoped to paths outside the workspace — routine in-workspace operations are no longer interrupted
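
One way to scope a deletion prompt to paths outside the workspace is a containment check on resolved paths, sketched below. This is an assumption about how such a check could look, not the bash tool's actual logic; `is_inside_workspace` and `needs_confirmation` are hypothetical helpers.

```python
import os


def is_inside_workspace(path, workspace):
    """True if path (relative paths are taken from the workspace) resolves inside it."""
    ws = os.path.realpath(workspace)
    # os.path.join keeps absolute paths as-is and anchors relative ones at ws.
    target = os.path.realpath(os.path.join(ws, path))
    try:
        return os.path.commonpath([ws, target]) == ws
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        return False


def needs_confirmation(delete_targets, workspace):
    """Ask for confirmation only when some target lies outside the workspace."""
    return any(not is_inside_workspace(p, workspace) for p in delete_targets)
```

Resolving with `realpath` before comparing means `../` segments and symlinks cannot be used to step outside the workspace unnoticed.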
## 🐛 Other Fixes
- Fixed Deep Dream firing duplicate runs in multi-instance setups
- Fixed missing `reasoning_content` on some history turns in DeepSeek multi-turn conversations
## 📦 Upgrade
Source-code deployments can run `cow update` or `./run.sh update` for a one-click upgrade, or pull the latest code and restart manually. See [Upgrade Guide](https://docs.cowagent.ai/en/guide/upgrade) for details.
> ⚠️ One-click Feishu app creation requires `lark-oapi>=1.5.5`. `cow update` pulls it automatically; manual deployments must update dependencies.
**Release Date**: 2026.05.05 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.7...2.0.8)

View File

@@ -23,6 +23,7 @@ If the current provider fails, the tool automatically tries the next one until i
| Vendor | Vision Model | Notes |
| --- | --- | --- |
| OpenAI / Compatible | Main model | All OpenAI-compatible multimodal models |
| Baidu Qianfan | Main model | Multimodal main models (e.g. `ernie-5.0`) handle images directly; falls back to `ernie-4.5-turbo-vl` for text-only main models |
| Qwen (DashScope) | Main model | Via MultiModalConversation API |
| Claude | Main model | Anthropic native image format |
| Gemini | Main model | inlineData format |
@@ -52,7 +53,7 @@ To specify a particular model for the vision tool, add to `config.json`:
{
"tool": {
"vision": {
"model": "gpt-4o"
"model": "ernie-4.5-turbo-vl"
}
}
}

View File

@@ -1,69 +1,107 @@
---
title: Feishu (Lark)
description: Integrate CowAgent into a Feishu application
description: Connect CowAgent to Feishu via a custom enterprise app
---
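Integrate CowAgent into Feishu by creating a custom enterprise app. You need to be a Feishu enterprise user with admin privileges.
> Connect CowAgent through a Feishu custom enterprise app. Supports 1-on-1 chat and group chat (@mention). It uses a WebSocket long connection, so no public IP is needed, and it also supports streaming typewriter replies and voice messages.
## 1. Create an Enterprise Custom App
<Note>
Connecting requires a Feishu enterprise user with admin privileges.
</Note>
### 1.1 Create the App
## 1. Connection Methods
Go to the [Feishu Developer Platform](https://open.feishu.cn/app/), click **Create Enterprise Custom App**, fill in the required information, and click **Create**:
### Option 1: One-click Creation (Recommended)
You do not need to create an app on the Feishu Developer Platform beforehand. After starting Cow, open the Web Console (default `http://127.0.0.1:9899/`), go to the **Channels** menu, click **Add Channel**, choose **Feishu**, and click **One-click Create Feishu App** on the **Scan QR** tab. Scanning the QR code with the **Feishu App** completes app creation and connection automatically.
<Note>
The created app has all required permissions (messaging, card read/write, group events, etc.) and event subscriptions preconfigured. Currently only the Feishu mainland China version is supported; Lark international is not yet supported.
</Note>
When started from the CLI without `feishu_app_id` configured, the QR code is also shown in the terminal.
### Option 2: Manual Creation
Create the app yourself on the Feishu Developer Platform, then connect from the Web Console or the config file.
**Step 1: Create the App**
1. Go to the [Feishu Developer Platform](https://open.feishu.cn/app/) and click **Create Enterprise Custom App**: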
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-create-app.jpg" width="500"/>
### 1.2 Bot 機能追加
**アプリ機能の追加**で、アプリに **Bot** 機能を追加します:
2. **アプリ機能の追加** で **Bot** 機能追加:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-add-bot.jpg" width="800"/>
### 1.3 アプリ権限の設定
**権限管理**をクリックし、**権限設定**の下の入力欄に以下の権限文字列を貼り付け、フィルタされたすべての権限を選択し、**一括有効化**をクリックして確認します:
3. **権限管理** で以下の権限を貼り付け、全選択して **一括有効化**:
```
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource
im:message,im:message.group_at_msg,im:message.group_at_msg:readonly,im:message.p2p_msg,im:message.p2p_msg:readonly,im:message:send_as_bot,im:resource,cardkit:card:write
```
<img src="https://cdn.link-ai.tech/doc/feishu-hosting-add-auth2.png" width="800"/>
## 2. プロジェクト設定
1. **認証情報と基本情報**から `App ID` と `App Secret` を取得します:
4. **認証情報と基本情報** から `App ID` と `App Secret` を取得:
<img src="https://img-1317903499.cos.ap-guangzhou.myqcloud.com/docs/feishu-hosting-appid-secret.jpg" width="800"/>
2. プロジェクトルートの `config.json` に以下の設定を追加します:
**ステップ 2: CowAgent に接続**
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_bot_name": "YOUR_BOT_NAME"
}
```
<Tabs>
<Tab title="Web コンソール">
Web コンソールから **チャネル** → **チャネルを追加** → **Feishu** → **手動入力** タブに切り替え、App ID と App Secret を入力して接続。
</Tab>
<Tab title="設定ファイル">
`config.json` に以下を追加して起動:
| パラメータ | 説明 |
| --- | --- |
| `feishu_app_id` | Feishu Bot の App ID |
| `feishu_app_secret` | Feishu Bot の App Secret |
| `feishu_bot_name` | Bot 名(アプリ作成時に設定)、グループチャットで使用する際に必要 |
```json
{
"channel_type": "feishu",
"feishu_app_id": "YOUR_APP_ID",
"feishu_app_secret": "YOUR_APP_SECRET",
"feishu_stream_reply": true
}
```
設定完了後、プロジェクトを起動します。
| パラメータ | 説明 | デフォルト |
| --- | --- | --- |
| `feishu_app_id` | Feishu アプリの App ID | - |
| `feishu_app_secret` | Feishu アプリの App Secret | - |
| `feishu_stream_reply` | ストリーミングタイプライター応答を有効化 | `true` |
</Tab>
</Tabs>
## 3. イベントサブスクリプションの設定
**ステップ 3: アプリの公開**
1. プロジェクトが正常に動作した後、Feishu 開発者プラットフォームに移動し、**イベントとコールバック**をクリックし、**ロングコネクション**モードを選択して保存をクリックします:
1. Cow 起動後、Feishu 開発者プラットフォームの **イベントとコールバック** で **ロングコネクション** モードを選択して保存:
<img src="https://cdn.link-ai.tech/doc/202601311731183.png" width="600"/>
2. 下の**イベントを追加**をクリックし、「メッセージ受信」を検索して「**メッセージ受信 v2.0**」を選択し、確認します。
2. **イベントを追加** で「メッセージ受信」を検索し、**メッセージ受信 v2.0** を選択。
3. **バージョン管理とリリース**をクリックし、新しいバージョンを作成し**本番リリース**を申請します。Feishu クライアントで承認メッセージを確認し、承認します:
3. **バージョン管理とリリース** で新バージョンを作成し **本番リリース** を申請し、Feishu クライアントで承認:
<img src="https://cdn.link-ai.tech/doc/202601311807356.png" width="600"/>
完了後、Feishu で Bot 名を検索してチャットを開始できます。
## 2. 機能一覧
| 機能 | 対応状況 |
| --- | --- |
| 1 対 1 チャット | ✅ |
| グループチャット(@Bot) | ✅ |
| テキストメッセージ | ✅ 送受信 |
| 画像メッセージ | ✅ 送受信 |
| 音声メッセージ | ✅ 送受信 |
| ストリーミング応答 | ✅(Feishu cardkit ストリーミングカードベース) |
<Note>
ストリーミング応答には `cardkit:card:write` 権限(ワンクリック作成では自動付与)と Feishu クライアント 7.20 以上が必要です。古いクライアントではアップグレード案内が表示され、権限/バージョン未充足時は通常テキスト応答に自動フォールバックします。
</Note>
## 3. 使い方
接続完了後、Feishu で Bot 名を検索してチャットを開始できます。
グループで使う場合は Bot をグループに追加し、@メンションでメッセージを送ってください。


@@ -6,7 +6,7 @@ description: CowAgentがサポートするモデルとおすすめの選択肢
CowAgentは国内外の主要なLLMをサポートしています。モデルインターフェースはプロジェクトの`models/`ディレクトリに実装されています。
<Note>
Agent モードでは、品質とコストのバランスから以下のモデルをおすすめします: deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.1-pro-preview、glm-5.1、qwen3.6-plus、kimi-k2.6
Agent モードでは、品質とコストのバランスから以下のモデルをおすすめします: deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.1-pro-preview、glm-5.1、qwen3.6-plus、kimi-k2.6、ernie-5.0
</Note>
## 設定
@@ -21,6 +21,9 @@ CowAgentは国内外の主要なLLMをサポートしています。モデルイ
<Card title="DeepSeek" href="/ja/models/deepseek">
deepseek-v4-flash、deepseek-v4-pro など
</Card>
<Card title="Baidu Qianfan / ERNIE" href="/ja/models/qianfan">
ernie-5.0、ernie-4.5-turbo-128k など
</Card>
<Card title="MiniMax" href="/ja/models/minimax">
MiniMax-M2.7およびその他のシリーズモデル
</Card>


@@ -0,0 +1,63 @@
---
title: Baidu Qianfan / ERNIE
description: Baidu Qianfan ERNIE モデル設定
---
方法 1: 公式接続(推奨):
```json
{
"model": "ernie-5.0",
"qianfan_api_key": "",
"qianfan_api_base": "https://qianfan.baidubce.com/v2"
}
```
| パラメータ | 説明 |
| --- | --- |
| `model` | デフォルトの推奨は `ernie-5.0`。`ernie-x1.1`、`ernie-4.5-turbo-128k`、`ernie-4.5-turbo-32k` も利用できます |
| `qianfan_api_key` | Qianfan API Key。通常は `bce-v3/` で始まります |
| `qianfan_api_base` | 任意。デフォルトは `https://qianfan.baidubce.com/v2` |
## モデル選択
| モデル | 用途 |
| --- | --- |
| `ernie-5.0` | デフォルト推奨。文心の最新フラッグシップモデルで、総合性能が最も強い |
| `ernie-x1.1` | 深層推論モデル。ハルシネーションが少なく、指示追従とツール呼び出しが強化 |
| `ernie-4.5-turbo-128k` | 長いコンテキストと一般的なチャット向け |
| `ernie-4.5-turbo-32k` | コンテキスト長とコストのバランスが良い一般チャット向け |
## Vision ツール
`qianfan_api_key` を設定すると、Agent モードの Vision ツールは Qianfan を自動検出します:
- 主モデルが多モーダル(`ernie-5.0`、`ernie-x1.1`、`ernie-4.5-turbo-vl` など)の場合は、追加設定なしで主モデルがそのまま画像を処理します。
- 主モデルがテキスト専用(`ernie-4.5-turbo-128k` など)の場合は、Vision ツールが自動的に `ernie-4.5-turbo-vl` にフォールバックします。
特定の Vision モデルを強制したい場合は、`config.json` で明示的に指定できます:
```json
{
"tool": {
"vision": {
"model": "ernie-4.5-turbo-vl"
}
}
}
```
方法 2: OpenAI 互換接続:
```json
{
"model": "ernie-5.0",
"bot_type": "openai",
"open_ai_api_key": "",
"open_ai_api_base": "https://qianfan.baidubce.com/v2"
}
```
<Tip>
新しい設定では `qianfan_api_key` の利用を推奨します。既存の `wenxin`、`wenxin-4`、`baidu_wenxin_api_key`、`baidu_wenxin_secret_key` 設定は引き続き利用できます。
</Tip>


@@ -0,0 +1,68 @@
---
title: v2.0.8
description: CowAgent 2.0.8 - 飛書チャネル全面アップグレード(音声、ストリーミングタイプライター、ワンクリック QR アプリ作成)、DeepSeek V4 / 百度千帆 ERNIE 5.0 サポート、スケジューラ記憶強化および複数の修正
---
## 🪶 飛書チャネル全面アップグレード
### 1. ワンクリック QR スキャンでアプリ作成
飛書オープンプラットフォームで手動でアプリを作成し、権限とイベントサブスクリプションを設定する必要がなくなりました。Web コンソールおよびコマンドライン起動時に `feishu_app_id` が未設定の場合、QR スキャン入口が自動的に表示されます。飛書でスキャン・認可するとボットが自動作成され、設定が自動で書き戻され、すぐに使い始められます。
ドキュメント:[飛書チャネル](https://docs.cowagent.ai/ja/channels/feishu)
### 2. 音声メッセージ送受信
ユーザーから送られた飛書の音声メッセージを受信し、自動的にテキストへ変換できるようになりました。返信も TTS による音声形式に対応。中国語の短い音声メッセージの認識精度も改善されています。
### 3. ストリーミングタイプライター返信
飛書 CardKit ストリーミングカードを統合し、**デフォルト有効**で Web コンソールと同等の体験を提供:
- マルチターンの Agent シナリオで、中間メッセージと最終回答を別カードで表示
- DeepSeek など高頻度出力モデル向けに最適化、Web コンソールと同等の速度を実現
- 非対応時は自動的に通常のテキスト返信にフォールバック、手動設定不要
- 飛書クライアント ≥ 7.20 が必要
飛書の音声メッセージ送受信とストリーミングタイプライターのベース機能はコミュニティ貢献 #2791 によるものです。Thanks [@ooaaooaa123](https://github.com/ooaaooaa123)
## 🤖 新モデルサポート
- **DeepSeek V4 シリーズ**:`deepseek-v4-pro` / `deepseek-v4-flash` を追加、デフォルトモデルを `deepseek-v4-flash` に切り替え
- **思考モデルスイッチの統一**:DeepSeek V4、Qwen3 など思考対応モデルの切り替え動作を `enable_thinking` に統一
- **百度千帆 / ERNIE のファーストクラス対応**:新たな `qianfan` プロバイダーを追加。`ernie-5.0`(デフォルト推奨)、`ernie-x1.1`、`ernie-4.5-turbo-128k`、`ernie-4.5-turbo-32k` をサポート。`qianfan_api_key` / `qianfan_api_base` の独立設定により OpenAI 設定を汚染せず、旧来の `wenxin` / `wenxin-4` パスも完全互換 #2790 Thanks [@jimmyzhuu](https://github.com/jimmyzhuu)
ドキュメント:[百度千帆 / ERNIE](https://docs.cowagent.ai/ja/models/qianfan)
## 🌐 翻訳プロバイダー
- **有道翻訳を追加**:`translate/` モジュールに有道翻訳プロバイダーを追加。v3 SHA-256 署名方式に対応し、`zh` / `zh-TW` などの ISO 639-1 言語コードを自動マッピング #2797 Thanks [@Zmjjeff7](https://github.com/Zmjjeff7)
## 🛠 OpenAI クライアントのリファクタリング
- **SDK 依存を排除**:OpenAI Bot をネイティブ HTTP クライアントに刷新、起動が軽量化、依存衝突も削減
- **Web コンソールヒント**:モデル設定の API Base 入力欄にバージョンパスのプレースホルダーヒントを追加
## ⏰ スケジューラ記憶強化
- **タスク結果への追問**:定期タスクの実行結果を受信側のセッション履歴に自動注入。次のターンでコンテキストを再説明することなくそのまま追問可能 Thanks [@huangrichao2020](https://github.com/huangrichao2020)
- **長期記憶を汚染しない**:注入されたスケジューラ対話は毎日の記憶フラッシュ対象から除外され、高頻度タスクで記憶ストアが埋まることを防止
- **遅くなり続ける問題を回避**:スケジューラ自身のコンテキスト長を自動制限、長期反復実行でも蓄積して応答を遅延させません
## 🔧 ツールと安全性
- **Vision モデル選択**:`tool.vision.model` 設定が実際に反映されるようになり、未設定時は自動フォールバック #2792
- **Bash セーフティ確認**:破壊的削除の確認プロンプトをワークスペース外のパスに限定。ワークスペース内の通常操作は中断されません
## 🐛 その他の修正
- マルチインスタンス環境で Deep Dream が重複実行される問題を修正
- DeepSeek マルチターン会話の一部の履歴ターンで `reasoning_content` が欠落する問題を修正
## 📦 アップグレード
ソースコードデプロイは `cow update` または `./run.sh update` でワンクリックアップグレード、または最新コードを手動で pull して再起動してください。詳細は[アップグレードガイド](https://docs.cowagent.ai/ja/guide/upgrade)を参照。
> ⚠️ 飛書のワンクリックアプリ作成は `lark-oapi>=1.5.5` が必要です。`cow update` は自動で取得します。手動デプロイの場合は依存関係の更新を確認してください。
**リリース日**:2026.05.06 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.7...2.0.8)


@@ -23,6 +23,7 @@ Vision ツールは多段階の自動選択+自動フォールバック戦略
| ベンダー | ビジョンモデル | 説明 |
| --- | --- | --- |
| OpenAI / 互換プロトコル | メインモデル | すべての OpenAI 互換マルチモーダルモデルに対応 |
| Baidu Qianfan | メインモデル | 多モーダルの主モデル(`ernie-5.0` など)は直接画像を処理。テキスト専用主モデルの場合は `ernie-4.5-turbo-vl` に自動フォールバック |
| 通義千問 (DashScope) | メインモデル | MultiModalConversation API 経由 |
| Claude | メインモデル | Anthropic ネイティブ画像形式 |
| Gemini | メインモデル | inlineData 形式 |
@@ -52,7 +53,7 @@ Vision ツールで使用するモデルを指定するには、`config.json`
{
"tool": {
"vision": {
"model": "gpt-4o"
"model": "ernie-4.5-turbo-vl"
}
}
}


@@ -6,7 +6,7 @@ description: CowAgent 支持的模型及推荐选择
CowAgent 支持国内外主流厂商的大语言模型,模型接口实现在项目的 `models/` 目录下。
<Note>
Agent 模式下推荐使用以下模型,可根据效果及成本综合选择:deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.1-pro-preview、glm-5.1、qwen3.6-plus、kimi-k2.6
Agent 模式下推荐使用以下模型,可根据效果及成本综合选择:deepseek-v4-flash、MiniMax-M2.7、claude-sonnet-4-6、gemini-3.1-pro-preview、glm-5.1、qwen3.6-plus、kimi-k2.6、ernie-5.0
同时支持使用 [LinkAI](https://link-ai.tech) 平台接口,可灵活切换多种模型,并支持知识库、工作流、插件等 Agent 能力。
</Note>
@@ -26,6 +26,9 @@ CowAgent 支持国内外主流厂商的大语言模型,模型接口实现在
<Card title="DeepSeek" href="/models/deepseek">
deepseek-v4-flash、deepseek-v4-pro 等
</Card>
<Card title="百度千帆 / ERNIE" href="/models/qianfan">
ernie-5.0、ernie-4.5-turbo-128k 等
</Card>
<Card title="MiniMax" href="/models/minimax">
MiniMax-M2.7 等系列模型
</Card>

63
docs/models/qianfan.mdx Normal file

@@ -0,0 +1,63 @@
---
title: 百度千帆
description: 百度千帆 ERNIE 模型配置
---
方式一:官方接入(推荐):
```json
{
"model": "ernie-5.0",
"qianfan_api_key": "",
"qianfan_api_base": "https://qianfan.baidubce.com/v2"
}
```
| 参数 | 说明 |
| --- | --- |
| `model` | 默认推荐使用 `ernie-5.0`;也可使用 `ernie-x1.1`、`ernie-4.5-turbo-128k`、`ernie-4.5-turbo-32k` |
| `qianfan_api_key` | 千帆 API Key,格式通常以 `bce-v3/` 开头 |
| `qianfan_api_base` | 可选,默认为 `https://qianfan.baidubce.com/v2` |
## 模型选择
| 模型 | 适用场景 |
| --- | --- |
| `ernie-5.0` | 默认推荐,文心新一代旗舰模型,综合能力最强 |
| `ernie-x1.1` | 深度思考推理模型,幻觉更低、指令遵循与工具调用更强 |
| `ernie-4.5-turbo-128k` | 长上下文和通用对话 |
| `ernie-4.5-turbo-32k` | 通用对话,成本和上下文更均衡 |
## Vision 工具
配置 `qianfan_api_key` 后,Agent 的 Vision 工具可以自动使用千帆视觉模型:
- 当主模型本身是多模态时(如 `ernie-5.0`、`ernie-x1.1`、`ernie-4.5-turbo-vl`),直接由主模型识别图像,无需额外配置
- 当主模型是纯文本时(如 `ernie-4.5-turbo-128k`),Vision 工具会自动 fallback 到 `ernie-4.5-turbo-vl`
如需手动指定 Vision 模型,可在 `config.json` 中显式配置:
```json
{
"tool": {
"vision": {
"model": "ernie-4.5-turbo-vl"
}
}
}
```
方式二:OpenAI 兼容方式接入:
```json
{
"model": "ernie-5.0",
"bot_type": "openai",
"open_ai_api_key": "",
"open_ai_api_base": "https://qianfan.baidubce.com/v2"
}
```
<Tip>
新配置推荐使用 `qianfan_api_key`。旧的 `wenxin`、`wenxin-4`、`baidu_wenxin_api_key`、`baidu_wenxin_secret_key` 配置仍保持兼容。
</Tip>
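
The OpenAI-compatible connection above amounts to an ordinary chat-completions request against the Qianfan base URL. The following is a minimal sketch of that request shape, not the project's client: `build_qianfan_chat_request` is a hypothetical helper, and the `/chat/completions` path and `Bearer` header are assumed from the OpenAI-compatible convention rather than taken from this diff. It only builds the request and sends nothing.

```python
import json

# Default base URL as documented in the config table above.
DEFAULT_QIANFAN_API_BASE = "https://qianfan.baidubce.com/v2"


def build_qianfan_chat_request(api_key, model="ernie-5.0", prompt="Hello", api_base=None):
    """Build (url, headers, body) for an OpenAI-style chat request; nothing is sent."""
    base = (api_base or DEFAULT_QIANFAN_API_BASE).rstrip("/")
    url = f"{base}/chat/completions"  # assumption: OpenAI-style endpoint path
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # assumption: Bearer auth with the bce-v3/ key
    }
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    })
    return url, headers, body
```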


@@ -5,6 +5,7 @@ description: CowAgent 版本更新历史
| 版本 | 日期 | 说明 |
| --- | --- | --- |
| [2.0.8](/releases/v2.0.8) | 2026.05.06 | 飞书渠道全面升级(语音、流式输出和Markdown、扫码一键接入)、DeepSeek V4和百度模型新增、定时任务工具增强 |
| [2.0.7](/releases/v2.0.7) | 2026.04.22 | 图像生成技能(六厂商自动路由)、新模型支持(Kimi K2.6、Claude Opus 4.7、GLM 5.1)、知识库增强、Web 控制台优化 |
| [2.0.6](/releases/v2.0.6) | 2026.04.14 | 项目更名、知识库系统、梦境记忆蒸馏、上下文智能压缩、Web 控制台多会话及多项优化 |
| [2.0.5](/releases/v2.0.5) | 2026.04.01 | Cow CLI、Skill Hub 开源、浏览器工具、企微扫码创建、多项优化和修复 |

63
docs/releases/v2.0.8.mdx Normal file

@@ -0,0 +1,63 @@
---
title: v2.0.8
description: CowAgent 2.0.8 - 飞书渠道全面升级(语音、流式打字机、一键扫码接入)、DeepSeek V4 / 百度千帆支持、定时任务工具优化
---
## 🪶 飞书渠道全面升级
### 1. 一键扫码创建飞书应用
不再需要手动到飞书开放平台建应用、填权限和事件订阅。Web 控制台和命令行启动时若未配置 `feishu_app_id`,会自动展示扫码入口,飞书扫码授权后自动创建机器人并回填配置,开箱即用。
相关文档:[飞书渠道](https://docs.cowagent.ai/channels/feishu)
### 2. 语音消息收发
支持接收用户发送的飞书语音消息并自动转文本,回复也可走 TTS 以语音形式发出。同时优化了中文短语音的识别准确度。
### 3. 流式打字机回复
接入飞书 CardKit 流式卡片,**默认开启**,体验对齐 Web 端:
- 多轮 Agent 场景下中间过场消息与最终回复分卡呈现
- 针对 DeepSeek 等高频输出模型做了专门优化,速度与 Web 端持平
- 不支持时自动回退为普通文本回复,无需手动配置
- 要求飞书客户端 ≥ 7.20
飞书语音消息收发与流式打字机的基础能力来自社区贡献 #2791 Thanks @ooaaooaa123
## 🤖 新模型支持
- **DeepSeek V4 系列**:新增 `deepseek-v4-pro` / `deepseek-v4-flash`,并将默认模型切换为 `deepseek-v4-flash`
- **思考模型开关统一**:DeepSeek V4、Qwen3 等思考模型的开关行为对齐到 `enable_thinking`
- **百度千帆模型接入**:新增百度千帆厂商,支持 `ernie-5.0`、`ernie-4.5-turbo-128k` 等模型,并支持图像识别工具,相关文档查看 [百度千帆](https://docs.cowagent.ai/models/qianfan)。#2790 Thanks @jimmyzhuu
- **新增有道翻译**:`translate` 模块新增有道翻译支持 #2797 Thanks @Zmjjeff7
## 🛠 OpenAI 客户端重构
- **去 SDK 依赖**:OpenAI Bot 改为原生 HTTP 实现,启动更轻、依赖冲突更少
- **Web 控制台提示**:模型配置 API Base 输入框加入版本路径占位提示
## ⏰ 定时任务记忆增强
- **任务结果可被追问**:定时任务的执行结果自动注入到接收方的会话历史中,下一轮对话可直接追问,无需重新交代上下文 Thanks @huangrichao2020
- **不污染长期记忆**:注入的调度对话不会被纳入每日梦境记忆汇总,避免高频任务把记忆刷满
- **避免越跑越慢**:调度任务自己的上下文长度自动控制在合理范围内,长期反复执行也不会越积越大、拖慢响应
## 🔧 工具与安全
- **图像识别模型**:让 `tool.vision.model` 配置真正生效,未配置时自动 fallback #2792 Thanks CNXudiandian
- **Bash 安全确认**:仅对工作区外的破坏性删除做二次确认,工作区内常规操作不再打扰
## 🐛 其他修复
- 修复 Deep Dream 在多实例场景下重复触发
- 修复 DeepSeek 多轮对话中部分历史轮次缺失 `reasoning_content`
## 📦 升级方式
源码部署可执行 `cow update` 或 `./run.sh update` 一键升级,或手动拉取代码后重启。详见 [更新升级文档](https://docs.cowagent.ai/guide/upgrade)。
> ⚠️ 飞书一键创建应用依赖 `lark-oapi>=1.5.5`,`cow update` 会自动拉取;手动部署请确保依赖已更新。
**发布日期**:2026.05.06 | [Full Changelog](https://github.com/zhayujie/CowAgent/compare/2.0.7...2.0.8)


@@ -38,3 +38,43 @@ description: 创建和管理定时任务
<Frame>
<img src="https://cdn.link-ai.tech/doc/20260202195402.png" width="800" />
</Frame>
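## Results Enter the Session Context
Scheduled tasks run in an isolated session, so their internal planning and tool calls do not pollute the user's conversation, but the **final output** is written back to the receiver's real session as a message pair. The user can then follow up directly, for example "expand on the second point from that last message".
**Default policy**
- Output from dynamic Agent tasks enters the context
- Fixed-message tasks do not enter the context by default (this can be enabled in the config)
- Each session keeps at most the **3 most recent pairs** of scheduler messages; older pairs are cleaned up automatically, and regular user messages are unaffected
**Configuration**
| Option | Default | Description |
| --- | --- | --- |
| `scheduler_inject_to_session` | `true` | Master switch |
| `scheduler_inject_max_per_session` | `3` | Maximum number of scheduler message pairs kept per session |
| `scheduler_inject_send_message` | `false` | Whether to also inject fixed-message tasks |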
```json
{
"scheduler_inject_to_session": true,
"scheduler_inject_max_per_session": 3,
"scheduler_inject_send_message": false
}
```
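
The retention rule behind `scheduler_inject_max_per_session` (keep only the newest N scheduler pairs, never touch regular user messages) can be sketched as follows. This is an illustration under assumptions, not the project's implementation: the message shape and the `scheduler_pair` key are hypothetical markers invented for the example.

```python
def prune_scheduler_pairs(messages, max_pairs=3):
    """Keep the newest `max_pairs` scheduler pairs; leave all other messages untouched.

    Each message is a dict. Scheduler-injected messages carry a hypothetical
    "scheduler_pair" id shared by both halves of the pair.
    """
    # Collect pair ids in order of first appearance (oldest first).
    pair_ids = []
    for message in messages:
        pair_id = message.get("scheduler_pair")
        if pair_id is not None and pair_id not in pair_ids:
            pair_ids.append(pair_id)
    # Guard max_pairs <= 0 explicitly: a [-0:] slice would keep everything.
    keep = set(pair_ids[-max_pairs:]) if max_pairs > 0 else set()
    return [
        m for m in messages
        if m.get("scheduler_pair") is None or m["scheduler_pair"] in keep
    ]
```

With four injected pairs and the default limit of 3, only the oldest pair is dropped; messages without the marker pass through unchanged.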
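## Context During Task Execution
The isolated session of a scheduled task keeps the conversation history of its last few runs, which supports operations such as "compare with the last run" or "continue from the earlier conclusion". To keep the prompt of high-frequency tasks (for example, monitoring every 5 minutes) from growing without bound, the history is trimmed automatically according to this formula: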
```
scheduler_keep_turns = max(1, agent_max_context_turns / 5)
```
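`agent_max_context_turns` defaults to `20`, so each run of a scheduled task carries the most recent **4 turns** of history by default. Increase `agent_max_context_turns` if longer memory is needed.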
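
As a quick check of the formula `scheduler_keep_turns = max(1, agent_max_context_turns / 5)`, here is a small sketch. The function name is hypothetical, and integer (floor) division is an assumption, since the docs do not say how fractional results are rounded.

```python
def scheduler_keep_turns(agent_max_context_turns=20):
    """Number of history turns a scheduled task carries per run.

    Assumes floor division; the max(1, ...) floor guarantees at least one turn.
    """
    return max(1, agent_max_context_turns // 5)
```

With the default of 20 this gives 4 turns, and any value below 5 still keeps 1 turn.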
<Note>
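In group chats (Feishu, WeCom group bots, DingTalk, and so on) the user's real session_id has the form `user_id:group_id`, which differs from the receiver. The correct session_id is recorded automatically when a task is created; when an older `tasks.json` lacks this field, it falls back to the receiver, matching the behavior of earlier versions.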
</Note>


@@ -24,6 +24,7 @@ Vision 工具采用多级自动选择 + 自动兜底策略,无需手动配置
| Gemini | 使用主模型 | inlineData 格式 |
| 豆包 (Doubao) | 使用主模型 | doubao-seed-2-0 系列原生支持 |
| Kimi (Moonshot) | 使用主模型 | kimi-k2.6、kimi-k2.5 原生支持 |
| 百度千帆 (Qianfan) | 使用主模型 | 默认使用多模态主模型 (如 ernie-5.0),主模型不支持时兜底使用 ernie-4.5-turbo-vl |
| 智谱 AI | glm-5v-turbo | 固定使用视觉专用模型 |
| MiniMax | MiniMax-Text-01 | 固定使用视觉专用模型 |
@@ -41,12 +42,14 @@ Vision 工具采用多级自动选择 + 自动兜底策略,无需手动配置
{
"tool": {
"vision": {
"model": "gpt-4o"
"model": "gpt-4.1"
}
}
}
```
指定的模型会被**优先使用**,工具会根据模型名自动路由到对应的 provider;若调用失败会自动 fallback 到其他已配置的 provider。
大多数情况下无需配置,主模型支持多模态或配置任意一个支持视觉的 API Key 即可自动工作。
## 参数


@@ -21,6 +21,10 @@ def create_bot(bot_type):
from models.deepseek.deepseek_bot import DeepSeekBot
return DeepSeekBot()
elif bot_type == const.QIANFAN:
from models.qianfan.qianfan_bot import QianfanBot
return QianfanBot()
elif bot_type in (const.OPENAI, const.CHATGPT, const.CUSTOM): # OpenAI-compatible API
from models.chatgpt.chat_gpt_bot import ChatGPTBot
return ChatGPTBot()


@@ -3,8 +3,15 @@
import time
import json
import openai
from models.openai.openai_compat import error as openai_error, RateLimitError, Timeout, APIError, APIConnectionError
from models.openai.openai_compat import (
error as openai_error,
RateLimitError,
Timeout,
APIError,
APIConnectionError,
wrap_http_error,
)
from models.openai.openai_http_client import OpenAIHTTPClient, OpenAIHTTPError
import requests
from common import const
from models.bot import Bot
@@ -23,18 +30,19 @@ from models.baidu.baidu_wenxin_session import BaiduWenxinSession
class ChatGPTBot(Bot, OpenAIImage, OpenAICompatibleBot):
def __init__(self):
super().__init__()
# set the default api_key / api_base based on bot_type
# Resolve api key / base from config (no global SDK state anymore).
if conf().get("bot_type") == "custom":
openai.api_key = conf().get("custom_api_key", "")
if conf().get("custom_api_base"):
openai.api_base = conf().get("custom_api_base")
self._api_key = conf().get("custom_api_key", "")
self._api_base = conf().get("custom_api_base") or None
else:
openai.api_key = conf().get("open_ai_api_key")
if conf().get("open_ai_api_base"):
openai.api_base = conf().get("open_ai_api_base")
proxy = conf().get("proxy")
if proxy:
openai.proxy = proxy
self._api_key = conf().get("open_ai_api_key")
self._api_base = conf().get("open_ai_api_base") or None
self._proxy = conf().get("proxy") or None
self._http_client = OpenAIHTTPClient(
api_key=self._api_key,
api_base=self._api_base,
proxy=self._proxy,
)
if conf().get("rate_limit_chatgpt"):
self.tb4chatgpt = TokenBucket(conf().get("rate_limit_chatgpt", 20))
conf_model = conf().get("model") or "gpt-3.5-turbo"
@@ -71,6 +79,10 @@ class ChatGPTBot(Bot, OpenAIImage, OpenAICompatibleBot):
'default_frequency_penalty': conf().get("frequency_penalty", 0.0),
'default_presence_penalty': conf().get("presence_penalty", 0.0),
}
def _get_http_client(self) -> OpenAIHTTPClient:
"""Override the default HTTP client to reuse our pre-configured one."""
return self._http_client
def reply(self, query, context=None):
# acquire reply content
@@ -195,20 +207,16 @@ class ChatGPTBot(Bot, OpenAIImage, OpenAICompatibleBot):
logger.info(f"[CHATGPT] Calling vision API with model: {model}")
# Call OpenAI API
kwargs = {
"model": model,
"messages": messages,
"max_tokens": 1000
}
if api_key:
kwargs["api_key"] = api_key
if api_base:
kwargs["api_base"] = api_base
response = openai.ChatCompletion.create(**kwargs)
content = response.choices[0]["message"]["content"]
# Call OpenAI-compatible API via HTTP
response = self._http_client.chat_completions(
api_key=api_key or None,
api_base=api_base or None,
model=model,
messages=messages,
max_tokens=1000,
)
content = response["choices"][0]["message"]["content"]
logger.info(f"[CHATGPT] Vision API response: {content[:100]}...")
# Clean up temp file
@@ -237,57 +245,100 @@ class ChatGPTBot(Bot, OpenAIImage, OpenAICompatibleBot):
try:
if conf().get("rate_limit_chatgpt") and not self.tb4chatgpt.get_token():
raise RateLimitError("RateLimitError: rate limit exceeded")
# if api_key == None, the default openai.api_key will be used
# If api_key is None, the per-instance default key will be used.
if args is None:
args = self.args
response = openai.ChatCompletion.create(api_key=api_key, messages=session.messages, **args)
# logger.debug("[CHATGPT] response={}".format(response))
logger.info("[ChatGPT] reply={}, total_tokens={}".format(response.choices[0]['message']['content'], response["usage"]["total_tokens"]))
# Translate old SDK kwargs to HTTP client params:
# - request_timeout / timeout -> per-call timeout
call_args = dict(args)
timeout = call_args.pop("request_timeout", None) or call_args.pop("timeout", None)
response = self._http_client.chat_completions(
api_key=api_key or None,
timeout=timeout,
messages=session.messages,
**call_args,
)
logger.info("[ChatGPT] reply={}, total_tokens={}".format(
response["choices"][0]["message"]["content"],
response["usage"]["total_tokens"]
))
return {
"total_tokens": response["usage"]["total_tokens"],
"completion_tokens": response["usage"]["completion_tokens"],
"content": response.choices[0]["message"]["content"],
"content": response["choices"][0]["message"]["content"],
}
except OpenAIHTTPError as http_err:
return self._handle_reply_error(
wrap_http_error(http_err), session, api_key, args, retry_count
)
except Exception as e:
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if isinstance(e, RateLimitError):
logger.warn("[CHATGPT] RateLimitError: {}".format(e))
result["content"] = "提问太快啦,请休息一下再问我吧"
if need_retry:
time.sleep(20)
elif isinstance(e, Timeout):
logger.warn("[CHATGPT] Timeout: {}".format(e))
result["content"] = "我没有收到你的消息"
if need_retry:
time.sleep(5)
elif isinstance(e, APIError):
logger.warn("[CHATGPT] Bad Gateway: {}".format(e))
result["content"] = "请再问我一次"
if need_retry:
time.sleep(10)
elif isinstance(e, APIConnectionError):
logger.warn("[CHATGPT] APIConnectionError: {}".format(e))
result["content"] = "我连接不到你的网络"
if need_retry:
time.sleep(5)
else:
logger.exception("[CHATGPT] Exception: {}".format(e))
need_retry = False
self.sessions.clear_session(session.session_id)
return self._handle_reply_error(e, session, api_key, args, retry_count)
def _handle_reply_error(self, e, session, api_key, args, retry_count):
"""Map exception to user-facing reply with retry/backoff (mirrors SDK behavior)."""
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if isinstance(e, RateLimitError):
logger.warn("[CHATGPT] RateLimitError: {}".format(e))
result["content"] = "提问太快啦,请休息一下再问我吧"
if need_retry:
logger.warn("[CHATGPT] 第{}次重试".format(retry_count + 1))
return self.reply_text(session, api_key, args, retry_count + 1)
else:
return result
time.sleep(20)
elif isinstance(e, Timeout):
logger.warn("[CHATGPT] Timeout: {}".format(e))
result["content"] = "我没有收到你的消息"
if need_retry:
time.sleep(5)
elif isinstance(e, APIConnectionError):
logger.warn("[CHATGPT] APIConnectionError: {}".format(e))
result["content"] = "我连接不到你的网络"
if need_retry:
time.sleep(5)
elif isinstance(e, APIError):
logger.warn("[CHATGPT] Bad Gateway: {}".format(e))
result["content"] = "请再问我一次"
if need_retry:
time.sleep(10)
else:
logger.exception("[CHATGPT] Exception: {}".format(e))
need_retry = False
self.sessions.clear_session(session.session_id)
if need_retry:
logger.warn("[CHATGPT] 第{}次重试".format(retry_count + 1))
return self.reply_text(session, api_key, args, retry_count + 1)
return result
class AzureChatGPTBot(ChatGPTBot):
"""Azure OpenAI variant.
Azure's HTTP shape differs from public OpenAI:
URL : {endpoint}/openai/deployments/{deployment}/chat/completions
Auth : api-key header (not Bearer)
Query : ?api-version={version}
We model that with a dedicated HTTP client and override _get_http_client
so the OpenAICompatibleBot streaming/tool path uses it transparently.
"""
def __init__(self):
super().__init__()
openai.api_type = "azure"
openai.api_version = conf().get("azure_api_version", "2023-06-01-preview")
self.args["deployment_id"] = conf().get("azure_deployment_id")
self._azure_api_version = conf().get("azure_api_version", "2023-06-01-preview")
self._azure_deployment_id = conf().get("azure_deployment_id")
# Drop legacy SDK kwarg; Azure deployment is encoded in the URL now.
self.args.pop("deployment_id", None)
endpoint = (self._api_base or "").rstrip("/")
deployment = self._azure_deployment_id or ""
# Build a base that already includes /openai/deployments/{deployment}.
# /chat/completions will be appended by the client.
azure_base = (
f"{endpoint}/openai/deployments/{deployment}" if endpoint and deployment else endpoint
)
self._http_client = _AzureChatHTTPClient(
api_key=self._api_key,
api_base=azure_base,
api_version=self._azure_api_version,
proxy=self._proxy,
)
def create_img(self, query, retry_count=0, api_key=None):
text_to_image_model = conf().get("text_to_image")
@@ -357,3 +408,35 @@ class AzureChatGPTBot(ChatGPTBot):
return False, "图片生成失败"
else:
return False, "图片生成失败未配置text_to_image参数"
class _AzureChatHTTPClient(OpenAIHTTPClient):
"""Subclass that injects Azure's ``api-version`` query param and ``api-key``
header on every chat-completion request, and accepts the deployment-scoped
base URL set by :class:`AzureChatGPTBot`.
"""
def __init__(self, api_key, api_base, api_version, proxy=None, timeout=None):
super().__init__(
api_key=api_key, api_base=api_base, proxy=proxy, timeout=timeout
)
self._api_version = api_version
def _build_headers(self, api_key, extra_headers):
# Azure uses api-key header, not Bearer token.
key = api_key if api_key is not None else self.api_key
headers = {"Content-Type": "application/json"}
if key:
headers["api-key"] = key
if self.extra_headers:
headers.update(self.extra_headers)
if extra_headers:
headers.update(extra_headers)
return headers
def chat_completions(self, **kwargs):
# Always force api-version query param for Azure.
eq = dict(kwargs.get("extra_query") or {})
eq.setdefault("api-version", self._api_version)
kwargs["extra_query"] = eq
return super().chat_completions(**kwargs)
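
The Azure request shape described in the `AzureChatGPTBot` docstring (deployment-scoped URL, `api-version` query parameter, `api-key` header) can be illustrated in isolation. This is a standalone sketch, not the `_AzureChatHTTPClient` code: `build_azure_chat_request` is a hypothetical helper and the endpoint and deployment values in the usage are placeholders.

```python
from urllib.parse import urlencode


def build_azure_chat_request(endpoint, deployment, api_version, api_key):
    """Return (url, headers) for an Azure OpenAI chat-completions call."""
    base = endpoint.rstrip("/")
    # Deployment is part of the path; the API version travels as a query parameter.
    query = urlencode({"api-version": api_version})
    url = f"{base}/openai/deployments/{deployment}/chat/completions?{query}"
    # Azure authenticates with an api-key header rather than a Bearer token.
    headers = {"Content-Type": "application/json", "api-key": api_key}
    return url, headers
```

Encoding the deployment in the base URL, as the diff does, lets the shared client keep appending `/chat/completions` without knowing it is talking to Azure.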


@@ -2,8 +2,14 @@
import time
import openai
from models.openai.openai_compat import RateLimitError, Timeout, APIConnectionError
from models.openai.openai_compat import (
RateLimitError,
Timeout,
APIConnectionError,
APIError,
wrap_http_error,
)
from models.openai.openai_http_client import OpenAIHTTPClient, OpenAIHTTPError
from models.bot import Bot
from models.openai_compatible_bot import OpenAICompatibleBot
@@ -22,12 +28,14 @@ user_session = dict()
class OpenAIBot(Bot, OpenAIImage, OpenAICompatibleBot):
def __init__(self):
super().__init__()
openai.api_key = conf().get("open_ai_api_key")
if conf().get("open_ai_api_base"):
openai.api_base = conf().get("open_ai_api_base")
proxy = conf().get("proxy")
if proxy:
openai.proxy = proxy
self._api_key = conf().get("open_ai_api_key")
self._api_base = conf().get("open_ai_api_base") or None
self._proxy = conf().get("proxy") or None
self._http_client = OpenAIHTTPClient(
api_key=self._api_key,
api_base=self._api_base,
proxy=self._proxy,
)
self.sessions = SessionManager(OpenAISession, model=conf().get("model") or "text-davinci-003")
self.args = {
@@ -54,6 +62,10 @@ class OpenAIBot(Bot, OpenAIImage, OpenAICompatibleBot):
'default_presence_penalty': conf().get("presence_penalty", 0.0),
}
def _get_http_client(self) -> OpenAIHTTPClient:
"""Reuse the per-instance HTTP client for the streaming/tool path."""
return self._http_client
def reply(self, query, context=None):
# acquire reply content
if context and context.type:
@@ -96,8 +108,14 @@ class OpenAIBot(Bot, OpenAIImage, OpenAICompatibleBot):
def reply_text(self, session: OpenAISession, retry_count=0):
try:
response = openai.Completion.create(prompt=str(session), **self.args)
res_content = response.choices[0]["text"].strip().replace("<|endoftext|>", "")
call_args = dict(self.args)
timeout = call_args.pop("request_timeout", None) or call_args.pop("timeout", None)
response = self._http_client.completions(
timeout=timeout,
prompt=str(session),
**call_args,
)
res_content = response["choices"][0]["text"].strip().replace("<|endoftext|>", "")
total_tokens = response["usage"]["total_tokens"]
completion_tokens = response["usage"]["completion_tokens"]
logger.info("[OPEN_AI] reply={}".format(res_content))
@@ -106,125 +124,41 @@ class OpenAIBot(Bot, OpenAIImage, OpenAICompatibleBot):
"completion_tokens": completion_tokens,
"content": res_content,
}
except OpenAIHTTPError as http_err:
return self._handle_legacy_error(wrap_http_error(http_err), session, retry_count)
except Exception as e:
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if isinstance(e, RateLimitError):
logger.warn("[OPEN_AI] RateLimitError: {}".format(e))
result["content"] = "提问太快啦,请休息一下再问我吧"
if need_retry:
time.sleep(20)
elif isinstance(e, Timeout):
logger.warn("[OPEN_AI] Timeout: {}".format(e))
result["content"] = "我没有收到你的消息"
if need_retry:
time.sleep(5)
elif isinstance(e, APIConnectionError):
logger.warn("[OPEN_AI] APIConnectionError: {}".format(e))
need_retry = False
result["content"] = "我连接不到你的网络"
else:
logger.warn("[OPEN_AI] Exception: {}".format(e))
need_retry = False
self.sessions.clear_session(session.session_id)
return self._handle_legacy_error(e, session, retry_count)
def _handle_legacy_error(self, e, session, retry_count):
"""Map exception -> reply for the legacy /completions endpoint."""
need_retry = retry_count < 2
result = {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
if isinstance(e, RateLimitError):
logger.warn("[OPEN_AI] RateLimitError: {}".format(e))
result["content"] = "提问太快啦,请休息一下再问我吧"
if need_retry:
logger.warn("[OPEN_AI] 第{}次重试".format(retry_count + 1))
return self.reply_text(session, retry_count + 1)
else:
return result
time.sleep(20)
elif isinstance(e, Timeout):
logger.warn("[OPEN_AI] Timeout: {}".format(e))
result["content"] = "我没有收到你的消息"
if need_retry:
time.sleep(5)
elif isinstance(e, APIConnectionError):
logger.warn("[OPEN_AI] APIConnectionError: {}".format(e))
need_retry = False
result["content"] = "我连接不到你的网络"
else:
logger.warn("[OPEN_AI] Exception: {}".format(e))
need_retry = False
self.sessions.clear_session(session.session_id)
def call_with_tools(self, messages, tools=None, stream=False, **kwargs):
"""
Call OpenAI API with tool support for agent integration
Note: This bot uses the old Completion API which doesn't support tools.
For tool support, use ChatGPTBot instead.
This method converts to ChatCompletion API when tools are provided.
Args:
messages: List of messages
tools: List of tool definitions (OpenAI format)
stream: Whether to use streaming
**kwargs: Additional parameters
Returns:
Formatted response in OpenAI format or generator for streaming
"""
try:
# The old Completion API doesn't support tools
# We need to use ChatCompletion API instead
logger.info("[OPEN_AI] Using ChatCompletion API for tool support")
# Build request parameters for ChatCompletion
request_params = {
"model": kwargs.get("model", conf().get("model") or "gpt-4.1"),
"messages": messages,
"temperature": kwargs.get("temperature", conf().get("temperature", 0.9)),
"top_p": kwargs.get("top_p", 1),
"frequency_penalty": kwargs.get("frequency_penalty", conf().get("frequency_penalty", 0.0)),
"presence_penalty": kwargs.get("presence_penalty", conf().get("presence_penalty", 0.0)),
"stream": stream
}
# Add max_tokens if specified
if kwargs.get("max_tokens"):
request_params["max_tokens"] = kwargs["max_tokens"]
# Add tools if provided
if tools:
request_params["tools"] = tools
request_params["tool_choice"] = kwargs.get("tool_choice", "auto")
# Make API call using ChatCompletion
if stream:
return self._handle_stream_response(request_params)
else:
return self._handle_sync_response(request_params)
except Exception as e:
logger.error(f"[OPEN_AI] call_with_tools error: {e}")
if stream:
def error_generator():
yield {
"error": True,
"message": str(e),
"status_code": 500
}
return error_generator()
else:
return {
"error": True,
"message": str(e),
"status_code": 500
}
def _handle_sync_response(self, request_params):
"""Handle synchronous OpenAI ChatCompletion API response"""
try:
response = openai.ChatCompletion.create(**request_params)
logger.info(f"[OPEN_AI] call_with_tools reply, model={response.get('model')}, "
f"total_tokens={response.get('usage', {}).get('total_tokens', 0)}")
return response
except Exception as e:
logger.error(f"[OPEN_AI] sync response error: {e}")
raise
def _handle_stream_response(self, request_params):
"""Handle streaming OpenAI ChatCompletion API response"""
try:
stream = openai.ChatCompletion.create(**request_params)
for chunk in stream:
yield chunk
except Exception as e:
logger.error(f"[OPEN_AI] stream response error: {e}")
yield {
"error": True,
"message": str(e),
"status_code": 500
}
if need_retry:
logger.warn("[OPEN_AI] 第{}次重试".format(retry_count + 1))
return self.reply_text(session, retry_count + 1)
return result
# NOTE: Tool-call routing is delegated to OpenAICompatibleBot.call_with_tools,
# which calls /chat/completions via our shared HTTP client. The previous
# bespoke implementation here bypassed Claude->OpenAI message/tool conversion
# and was effectively broken for agent flows; we now inherit the correct
# implementation from the base class.


@@ -1,17 +1,25 @@
import time
import openai
from models.openai.openai_compat import RateLimitError
from common.log import logger
from common.token_bucket import TokenBucket
from config import conf
from models.openai.openai_compat import RateLimitError, wrap_http_error
from models.openai.openai_http_client import OpenAIHTTPClient, OpenAIHTTPError
# OPENAI提供的画图接口
# OpenAI image generation API wrapper
class OpenAIImage(object):
def __init__(self):
openai.api_key = conf().get("open_ai_api_key")
# Lazy default client; subclasses (ChatGPTBot/OpenAIBot) typically
# construct their own _http_client and override _get_image_client().
self._image_api_key = conf().get("open_ai_api_key")
self._image_api_base = conf().get("open_ai_api_base") or None
self._image_proxy = conf().get("proxy") or None
self._image_client = OpenAIHTTPClient(
api_key=self._image_api_key,
api_base=self._image_api_base,
proxy=self._image_proxy,
)
if conf().get("rate_limit_dalle"):
self.tb4dalle = TokenBucket(conf().get("rate_limit_dalle", 50))
@@ -20,24 +28,35 @@ class OpenAIImage(object):
if conf().get("rate_limit_dalle") and not self.tb4dalle.get_token():
return False, "请求太快了,请休息一下再问我吧"
logger.info("[OPEN_AI] image_query={}".format(query))
response = openai.Image.create(
api_key=api_key,
prompt=query, # 图片描述
n=1, # 每次生成图片的数量
response = self._image_client.images_generate(
api_key=api_key or None,
api_base=api_base or None,
prompt=query, # image description
n=1,
model=conf().get("text_to_image") or "dall-e-2",
# size=conf().get("image_create_size", "256x256"), # 图片大小,可选有 256x256, 512x512, 1024x1024
# size=conf().get("image_create_size", "256x256"),
)
image_url = response["data"][0]["url"]
logger.info("[OPEN_AI] image_url={}".format(image_url))
return True, image_url
except OpenAIHTTPError as http_err:
mapped = wrap_http_error(http_err)
if isinstance(mapped, RateLimitError):
logger.warn(mapped)
if retry_count < 1:
time.sleep(5)
logger.warn("[OPEN_AI] ImgCreate RateLimit exceed, 第{}次重试".format(retry_count + 1))
return self.create_img(query, retry_count + 1)
return False, "画图出现问题,请休息一下再问我吧"
logger.exception(mapped)
return False, "画图出现问题,请休息一下再问我吧"
except RateLimitError as e:
logger.warn(e)
if retry_count < 1:
time.sleep(5)
logger.warn("[OPEN_AI] ImgCreate RateLimit exceed, 第{}次重试".format(retry_count + 1))
return self.create_img(query, retry_count + 1)
else:
return False, "画图出现问题,请休息一下再问我吧"
return False, "画图出现问题,请休息一下再问我吧"
except Exception as e:
logger.exception(e)
return False, "画图出现问题,请休息一下再问我吧"


@@ -1,102 +1,163 @@
"""
OpenAI compatibility layer for different versions.
OpenAI-compatible exception layer.
This module provides a compatibility layer between OpenAI library versions:
- OpenAI < 1.0 (old API with openai.error module)
- OpenAI >= 1.0 (new API with direct exception imports)
This module used to bridge between openai SDK 0.x and 1.x exception types.
Since we no longer depend on the `openai` SDK at all (we call HTTP directly
via :mod:`models.openai.openai_http_client`), this file now provides:
1. Pure Python exception classes that match the *names* the rest of the
codebase already imports (RateLimitError / Timeout / APIError /
APIConnectionError / AuthenticationError / InvalidRequestError ...).
2. A :func:`map_http_error` helper that converts an
:class:`OpenAIHTTPError` (or any HTTP status code + message) into the
appropriate exception subclass, so existing ``except RateLimitError``,
``except Timeout``, etc. blocks keep working unchanged.
This keeps the behavior of all existing bots (rate-limit backoff, timeout
retry, auth-error fast-fail) identical to the openai-SDK-based version, while
removing the hard dependency on the `openai` package.
"""
try:
# Try new OpenAI >= 1.0 API
from openai import (
OpenAIError,
RateLimitError,
APIError,
APIConnectionError,
AuthenticationError,
APITimeoutError,
BadRequestError,
)
# Create a mock error module for backward compatibility
class ErrorModule:
OpenAIError = OpenAIError
RateLimitError = RateLimitError
APIError = APIError
APIConnectionError = APIConnectionError
AuthenticationError = AuthenticationError
Timeout = APITimeoutError # Renamed in new version
InvalidRequestError = BadRequestError # Renamed in new version
error = ErrorModule()
# Also export with new names
Timeout = APITimeoutError
InvalidRequestError = BadRequestError
except ImportError:
# Fall back to old OpenAI < 1.0 API
try:
import openai.error as error
# Export individual exceptions for direct import
OpenAIError = error.OpenAIError
RateLimitError = error.RateLimitError
APIError = error.APIError
APIConnectionError = error.APIConnectionError
AuthenticationError = error.AuthenticationError
InvalidRequestError = error.InvalidRequestError
Timeout = error.Timeout
BadRequestError = error.InvalidRequestError # Alias
APITimeoutError = error.Timeout # Alias
except (ImportError, AttributeError):
# Neither version works, create dummy classes
class OpenAIError(Exception):
pass
class RateLimitError(OpenAIError):
pass
class APIError(OpenAIError):
pass
class APIConnectionError(OpenAIError):
pass
class AuthenticationError(OpenAIError):
pass
class InvalidRequestError(OpenAIError):
pass
class Timeout(OpenAIError):
pass
BadRequestError = InvalidRequestError
APITimeoutError = Timeout
# Create error module
class ErrorModule:
OpenAIError = OpenAIError
RateLimitError = RateLimitError
APIError = APIError
APIConnectionError = APIConnectionError
AuthenticationError = AuthenticationError
InvalidRequestError = InvalidRequestError
Timeout = Timeout
error = ErrorModule()
from typing import Optional
# --------------------------------------------------------------------------- #
# Exception hierarchy (mirrors openai SDK names so call sites don't change)
# --------------------------------------------------------------------------- #
class OpenAIError(Exception):
"""Base exception for all OpenAI-compatible API errors."""
def __init__(self, message: str = "", status_code: Optional[int] = None,
body=None):
super().__init__(message)
self.message = message
self.status_code = status_code
self.body = body
class APIError(OpenAIError):
"""Generic API error (5xx and unclassified errors)."""
class APIConnectionError(OpenAIError):
"""Network / connection failure (DNS, refused, reset...)."""
class Timeout(OpenAIError):
"""Request timeout. Aliased as APITimeoutError for new-SDK style imports."""
class AuthenticationError(OpenAIError):
"""401 Unauthorized."""
class PermissionDeniedError(OpenAIError):
"""403 Forbidden."""
class NotFoundError(OpenAIError):
"""404 Not Found."""
class InvalidRequestError(OpenAIError):
"""400 Bad Request. Aliased as BadRequestError."""
class RateLimitError(OpenAIError):
"""429 Too Many Requests."""
# Aliases used by some new-SDK-style code paths in the project.
APITimeoutError = Timeout
BadRequestError = InvalidRequestError
# --------------------------------------------------------------------------- #
# Backward-compat ``error`` module-style accessor
# --------------------------------------------------------------------------- #
# Some legacy code in the codebase (and possibly user plugins) does
# from models.openai.openai_compat import error
# except error.RateLimitError: ...
# Keep that path working by exposing an attribute namespace.
class _ErrorModule:
OpenAIError = OpenAIError
APIError = APIError
APIConnectionError = APIConnectionError
Timeout = Timeout
AuthenticationError = AuthenticationError
PermissionDeniedError = PermissionDeniedError
NotFoundError = NotFoundError
InvalidRequestError = InvalidRequestError
RateLimitError = RateLimitError
error = _ErrorModule()
# --------------------------------------------------------------------------- #
# HTTP -> exception mapping
# --------------------------------------------------------------------------- #
def map_http_error(status_code: Optional[int], message: str = "",
body=None) -> OpenAIError:
"""Convert an HTTP status (+ optional message/body) to the right subclass.
Used by HTTP-based bot wrappers so that downstream ``except RateLimitError``
blocks behave identically to when the openai SDK was raising them.
"""
sc = status_code or 0
msg = message or ""
msg_lower = msg.lower()
# Connection-level (no status / non-HTTP failure)
if sc == 0:
if "timeout" in msg_lower or "timed out" in msg_lower:
return Timeout(msg, sc, body)
return APIConnectionError(msg, sc, body)
if sc == 408:
return Timeout(msg, sc, body)
if sc == 401:
return AuthenticationError(msg, sc, body)
if sc == 403:
return PermissionDeniedError(msg, sc, body)
if sc == 404:
return NotFoundError(msg, sc, body)
if sc == 429:
return RateLimitError(msg, sc, body)
if 400 <= sc < 500:
return InvalidRequestError(msg, sc, body)
# 5xx and any other unclassified status fall through to the generic error.
return APIError(msg, sc, body)
def wrap_http_error(http_err) -> OpenAIError:
"""Adapter for :class:`OpenAIHTTPError` -> compat exception subclass.
Accepts any object with ``status_code`` / ``message`` / ``body`` attrs.
"""
sc = getattr(http_err, "status_code", None)
msg = getattr(http_err, "message", "") or str(http_err)
body = getattr(http_err, "body", None)
return map_http_error(sc, msg, body)
# Export all for easy import
__all__ = [
'error',
'OpenAIError',
'RateLimitError',
'APIError',
'APIConnectionError',
'AuthenticationError',
'InvalidRequestError',
'Timeout',
'BadRequestError',
'APITimeoutError',
"error",
"OpenAIError",
"APIError",
"APIConnectionError",
"Timeout",
"APITimeoutError",
"AuthenticationError",
"PermissionDeniedError",
"NotFoundError",
"InvalidRequestError",
"BadRequestError",
"RateLimitError",
"map_http_error",
"wrap_http_error",
]
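
The status-to-exception mapping above depends on check order: the specific codes (408, 429) must be tested before the generic 4xx range, and a missing status is treated as a connection-level failure. The following is a minimal standalone sketch of that ordering, not the module itself. The `classify` function and the trimmed class list are illustrative stand-ins, and the 401/403/404 branches are omitted for brevity.

```python
from typing import Optional


class OpenAIError(Exception):
    """Illustrative base class; carries the HTTP status alongside the message."""

    def __init__(self, message: str = "", status_code: Optional[int] = None):
        super().__init__(message)
        self.status_code = status_code


class Timeout(OpenAIError): pass
class APIConnectionError(OpenAIError): pass
class RateLimitError(OpenAIError): pass
class InvalidRequestError(OpenAIError): pass
class APIError(OpenAIError): pass


def classify(status_code: Optional[int], message: str = "") -> OpenAIError:
    """Hypothetical stand-in for map_http_error, reduced to five classes."""
    sc = status_code or 0
    lowered = message.lower()
    if sc == 0:
        # No HTTP status: the request never completed at the transport level.
        if "timeout" in lowered or "timed out" in lowered:
            return Timeout(message, sc)
        return APIConnectionError(message, sc)
    if sc == 408:
        return Timeout(message, sc)
    if sc == 429:
        return RateLimitError(message, sc)
    if 400 <= sc < 500:
        # Must come after the specific 4xx checks above.
        return InvalidRequestError(message, sc)
    return APIError(message, sc)


if __name__ == "__main__":
    for status in (None, 408, 422, 429, 503):
        print(status, type(classify(status, "Request timed out")).__name__)
```

Because every class derives from `OpenAIError`, a call site can catch the narrow type it retries on and let the base class handle everything else.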


@@ -0,0 +1,456 @@
# encoding:utf-8
"""
Lightweight HTTP client for OpenAI-compatible APIs.
This client is a drop-in replacement for the parts of the `openai` SDK that this
project actually uses (chat completions, completions, image generation), so we
can drop the hard dependency on `openai==0.27.x`.
Design goals:
- Pure `requests` based (no httpx / pydantic / openai SDK dependency).
- Returns plain `dict` responses with the same shape OpenAI's HTTP API returns,
so existing code that does `response["choices"][0]["message"]["content"]` /
`response["usage"]["total_tokens"]` keeps working.
- Streaming yields plain `dict` chunks (parsed SSE `data:` JSON), matching the
shape that `agent/protocol/agent_stream.py` consumes:
chunk["choices"][0]["delta"]["content" | "tool_calls" | "reasoning_content"]
chunk["choices"][0]["finish_reason"]
Plus dict-style error chunks: {"error": True, "message": ..., "status_code": ...}
- Compatible with arbitrary OpenAI-compatible endpoints (LinkAI, Azure-style
proxies, DeepSeek, Moonshot, etc.) by allowing per-call api_key / api_base
override and trusting whatever path/payload shape the caller passes.
"""
import json
from typing import Any, Dict, Generator, Optional
import requests
from common.log import logger
DEFAULT_API_BASE = "https://api.openai.com/v1"
DEFAULT_TIMEOUT = 600 # seconds; matches old openai SDK default
class OpenAIHTTPError(Exception):
"""Raised for non-2xx responses. Carries status code + parsed body."""
def __init__(self, status_code: int, body: Any, message: str = ""):
self.status_code = status_code
self.body = body
# Try to extract human-readable message from OpenAI-style error envelope
if not message and isinstance(body, dict):
err = body.get("error") or {}
if isinstance(err, dict):
message = err.get("message") or ""
elif isinstance(err, str):
message = err
if not message:
message = str(body)[:500]
self.message = message
super().__init__(f"HTTP {status_code}: {message}")
class OpenAIHTTPClient:
"""Minimal HTTP client for OpenAI-compatible endpoints.
Per-instance defaults (api_key / api_base / proxy / timeout) can be
overridden on every call. Callers can also pass ``extra_headers`` for
Azure-style ``api-key`` headers or custom routing headers.
"""
def __init__(
self,
api_key: Optional[str] = None,
api_base: Optional[str] = None,
proxy: Optional[str] = None,
timeout: Optional[float] = None,
extra_headers: Optional[Dict[str, str]] = None,
):
self.api_key = api_key
self.api_base = (api_base or DEFAULT_API_BASE).rstrip("/")
self.timeout = timeout if timeout is not None else DEFAULT_TIMEOUT
self.proxies = (
{"http": proxy, "https": proxy} if proxy else None
)
self.extra_headers = dict(extra_headers) if extra_headers else {}
# ------------------------------------------------------------------ #
# Public API surface (mirrors what the old openai SDK provided)
# ------------------------------------------------------------------ #
def chat_completions(
self,
*,
api_key: Optional[str] = None,
api_base: Optional[str] = None,
timeout: Optional[float] = None,
proxy: Optional[str] = None,
extra_headers: Optional[Dict[str, str]] = None,
extra_query: Optional[Dict[str, str]] = None,
path: str = "/chat/completions",
stream: bool = False,
**payload,
):
"""POST /chat/completions.
When ``stream=True`` returns a generator yielding parsed SSE chunks
(plain ``dict``). On error during streaming, yields a single dict with
``{"error": True, ...}`` and stops, matching the contract expected by
``agent/protocol/agent_stream.py``.
"""
payload["stream"] = stream
return self._request(
path=path,
payload=payload,
api_key=api_key,
api_base=api_base,
timeout=timeout,
proxy=proxy,
extra_headers=extra_headers,
extra_query=extra_query,
stream=stream,
)
def completions(
self,
*,
api_key: Optional[str] = None,
api_base: Optional[str] = None,
timeout: Optional[float] = None,
**payload,
) -> Dict[str, Any]:
"""POST /completions (legacy text completion). Non-streaming only."""
payload.pop("stream", None)
return self._request(
path="/completions",
payload=payload,
api_key=api_key,
api_base=api_base,
timeout=timeout,
stream=False,
)
def images_generate(
self,
*,
api_key: Optional[str] = None,
api_base: Optional[str] = None,
timeout: Optional[float] = None,
**payload,
) -> Dict[str, Any]:
"""POST /images/generations."""
return self._request(
path="/images/generations",
payload=payload,
api_key=api_key,
api_base=api_base,
timeout=timeout,
stream=False,
)
# ------------------------------------------------------------------ #
# Internal helpers
# ------------------------------------------------------------------ #
def _build_headers(
self,
api_key: Optional[str],
extra_headers: Optional[Dict[str, str]],
) -> Dict[str, str]:
key = api_key if api_key is not None else self.api_key
headers = {"Content-Type": "application/json"}
if key:
headers["Authorization"] = f"Bearer {key}"
if self.extra_headers:
headers.update(self.extra_headers)
if extra_headers:
headers.update(extra_headers)
return headers
def _request(
self,
*,
path: str,
payload: Dict[str, Any],
api_key: Optional[str],
api_base: Optional[str],
timeout: Optional[float],
stream: bool,
proxy: Optional[str] = None,
extra_headers: Optional[Dict[str, str]] = None,
extra_query: Optional[Dict[str, str]] = None,
):
base = api_base.rstrip("/") if api_base else self.api_base
url = f"{base}{path}" if path.startswith("/") else f"{base}/{path}"
headers = self._build_headers(api_key, extra_headers)
req_timeout = timeout if timeout is not None else self.timeout
proxies = (
{"http": proxy, "https": proxy} if proxy else self.proxies
)
# Drop None-valued keys; some providers reject explicit nulls.
clean_payload = {k: v for k, v in payload.items() if v is not None}
if stream:
# Return a generator. Errors during stream are yielded as a single
# error chunk so callers (agent_stream) can map them to their
# existing error-handling path without try/except around the loop.
return self._stream_chat(
url=url,
headers=headers,
payload=clean_payload,
proxies=proxies,
timeout=req_timeout,
params=extra_query,
)
try:
resp = requests.post(
url,
headers=headers,
json=clean_payload,
timeout=req_timeout,
proxies=proxies,
params=extra_query,
)
except requests.exceptions.Timeout as e:
raise OpenAIHTTPError(408, {}, f"Request timed out: {e}")
except requests.exceptions.ConnectionError as e:
raise OpenAIHTTPError(0, {}, f"Connection error: {e}")
except requests.exceptions.RequestException as e:
raise OpenAIHTTPError(0, {}, f"Request failed: {e}")
return self._parse_response(resp)
@staticmethod
def _parse_response(resp: requests.Response) -> Dict[str, Any]:
# Try JSON, fall back to text
try:
data = resp.json()
except ValueError:
data = {"raw": resp.text}
if resp.status_code >= 400:
raise OpenAIHTTPError(resp.status_code, data)
return data
def _stream_chat(
self,
*,
url: str,
headers: Dict[str, str],
payload: Dict[str, Any],
proxies: Optional[Dict[str, str]],
timeout: float,
params: Optional[Dict[str, str]] = None,
) -> Generator[Dict[str, Any], None, None]:
"""Stream SSE response and yield parsed JSON chunks.
Yields:
- Normal chunks: dict with ``choices[0].delta`` etc.
- Error chunks: ``{"error": True, "message": str, "status_code": int}``
followed by termination of the generator.
"""
try:
resp = requests.post(
url,
headers=headers,
json=payload,
timeout=timeout,
proxies=proxies,
stream=True,
params=params,
)
except requests.exceptions.Timeout as e:
yield self._make_error_chunk(408, f"Request timed out: {e}")
return
except requests.exceptions.ConnectionError as e:
yield self._make_error_chunk(0, f"Connection error: {e}")
return
except requests.exceptions.RequestException as e:
yield self._make_error_chunk(0, f"Request failed: {e}")
return
if resp.status_code >= 400:
# Read full body once for error reporting
try:
body = resp.json()
except ValueError:
body = {"raw": resp.text[:1000]}
err_msg = ""
err_code = ""
err_type = ""
if isinstance(body, dict):
err = body.get("error") or {}
if isinstance(err, dict):
err_msg = err.get("message") or ""
err_code = err.get("code") or ""
err_type = err.get("type") or ""
elif isinstance(err, str):
err_msg = err
if not err_msg:
err_msg = str(body)[:500]
yield {
"error": {
"message": err_msg,
"code": err_code,
"type": err_type,
},
# Top-level fields kept for backward compatibility with the
# error-shape that `_handle_stream_response` previously emitted.
"message": err_msg,
"status_code": resp.status_code,
}
return
# IMPORTANT: do NOT use `iter_lines(decode_unicode=True)`.
#
# `requests` decodes per-network-chunk using the response's declared
# encoding (often Latin-1 / ISO-8859-1 for SSE), which mangles UTF-8
# codepoints that straddle a chunk boundary. Some upstreams (Azure
# OpenAI proxies, Cloudflare-fronted gateways, ...) split TCP chunks
# aggressively in the middle of multibyte characters, producing
# garbled text and "skip malformed SSE chunk" errors.
#
# The fix is to read raw bytes, accumulate them until we have a
# complete SSE event (terminated by a blank line per the SSE spec:
# https://html.spec.whatwg.org/multipage/server-sent-events.html),
# and only THEN decode as UTF-8. This mirrors what the official
# openai SDK 1.x does in `openai/_streaming.py::SSEDecoder` (which
# itself is copied from httpx-sse).
try:
for sse_event in self._iter_sse_events(resp):
# `sse_event` is the joined `data:` payload as a str.
if sse_event == "[DONE]":
return
if not sse_event:
continue
try:
chunk = json.loads(sse_event)
except ValueError:
logger.debug(
f"[OpenAIHTTP] skip malformed SSE chunk: {sse_event[:200]}"
)
continue
yield chunk
except requests.exceptions.ChunkedEncodingError as e:
yield self._make_error_chunk(0, f"Stream interrupted: {e}")
except requests.exceptions.RequestException as e:
yield self._make_error_chunk(0, f"Stream error: {e}")
finally:
try:
resp.close()
except Exception:
pass
@staticmethod
def _iter_sse_events(resp: requests.Response) -> Generator[str, None, None]:
"""Decode an SSE byte stream into joined `data:` payloads.
Implements the subset of the SSE spec that OpenAI / OpenAI-compatible
endpoints actually use:
- Events are separated by blank lines (\\r\\r, \\n\\n, or \\r\\n\\r\\n).
- Within an event, multiple ``data:`` lines are concatenated with
"\\n" (per spec).
- ``event:``, ``id:``, ``retry:`` and comment lines (``:``) are
tolerated but not yielded — for chat-completion we only care
about the JSON payload in ``data:``.
- Bytes are buffered until a complete event boundary is seen so
UTF-8 codepoints split across TCP chunks decode correctly.
Yields each event's joined ``data`` string. The terminal sentinel
``[DONE]`` is yielded as a literal string so the caller can break.
"""
buf = b""
for raw in resp.iter_content(chunk_size=None, decode_unicode=False):
if not raw:
continue
buf += raw
# Find complete events (terminated by a blank line).
while True:
# Look for the earliest event terminator. SSE allows three
# forms; check all and pick the earliest match.
idx_nn = buf.find(b"\n\n")
idx_rr = buf.find(b"\r\r")
idx_rnrn = buf.find(b"\r\n\r\n")
candidates = [i for i in (idx_nn, idx_rr, idx_rnrn) if i != -1]
if not candidates:
break
# We need to know the length of the matched terminator to
# advance past it correctly.
end_pos = min(candidates)
if end_pos == idx_rnrn:
term_len = 4
else:
term_len = 2
event_bytes = buf[:end_pos]
buf = buf[end_pos + term_len:]
# Decode the full event as UTF-8. ``errors="replace"`` is a
# belt-and-suspenders safety net for truly malformed upstream
# bytes; it should never trigger for well-formed providers.
try:
event_text = event_bytes.decode("utf-8")
except UnicodeDecodeError:
event_text = event_bytes.decode("utf-8", errors="replace")
data_lines = []
for line in event_text.splitlines():
if not line or line.startswith(":"):
continue
field, _, value = line.partition(":")
# Per SSE spec, a single optional space after the colon
# is part of the framing, not the value.
if value.startswith(" "):
value = value[1:]
if field == "data":
data_lines.append(value)
# Other fields (event/id/retry) are intentionally ignored
# — chat-completion endpoints don't use them in a way we
# need for parsing.
if data_lines:
yield "\n".join(data_lines)
# Flush any trailing bytes the server forgot to terminate. The SSE spec
# says an unterminated final event should be discarded, but some providers
# omit the final \n\n, so we parse it leniently instead of dropping it.
if buf.strip():
try:
event_text = buf.decode("utf-8")
except UnicodeDecodeError:
event_text = buf.decode("utf-8", errors="replace")
data_lines = []
for line in event_text.splitlines():
if not line or line.startswith(":"):
continue
field, _, value = line.partition(":")
if value.startswith(" "):
value = value[1:]
if field == "data":
data_lines.append(value)
if data_lines:
yield "\n".join(data_lines)
@staticmethod
def _make_error_chunk(status_code: int, message: str) -> Dict[str, Any]:
return {
"error": {"message": message, "code": "", "type": ""},
"message": message,
"status_code": status_code,
}
# A tiny helper for callers that just need a one-shot client without storing
# state. Keeps call sites cleaner than instantiating the class every time.
def get_default_client(
api_key: Optional[str] = None,
api_base: Optional[str] = None,
proxy: Optional[str] = None,
timeout: Optional[float] = None,
) -> OpenAIHTTPClient:
return OpenAIHTTPClient(
api_key=api_key, api_base=api_base, proxy=proxy, timeout=timeout
)
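
The byte-buffering approach described in the streaming comments can be exercised without a network connection. The sketch below is a simplified standalone version, not the client code: `iter_sse_data` is a hypothetical helper that takes any iterable of byte chunks in place of `resp.iter_content(...)`. It buffers raw bytes, splits on the blank-line terminators, and decodes as UTF-8 only once a whole event is available.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield the joined `data:` payload of each complete SSE event."""
    buf = b""
    for raw in chunks:
        buf += raw
        while True:
            # Locate the earliest of the three blank-line terminators.
            hits = [(buf.find(t), len(t)) for t in (b"\r\n\r\n", b"\n\n", b"\r\r")]
            hits = [(pos, size) for pos, size in hits if pos != -1]
            if not hits:
                break  # no complete event yet; wait for more bytes
            end, term_len = min(hits)
            event, buf = buf[:end], buf[end + term_len:]
            data_lines = []
            # Decode only now, when the event is known to be complete.
            for line in event.decode("utf-8", errors="replace").splitlines():
                field, _, value = line.partition(":")
                if field == "data":
                    # One optional space after the colon is framing, not data.
                    data_lines.append(value[1:] if value.startswith(" ") else value)
            if data_lines:
                yield "\n".join(data_lines)


if __name__ == "__main__":
    payload = 'data: {"text": "你好"}\n\n'.encode("utf-8")
    cut = payload.index("你".encode("utf-8")) + 1  # split inside a 3-byte character
    stream = [payload[:cut], payload[cut:], b"data: [DONE]\n\n"]
    for item in iter_sse_data(stream):
        print(item if item == "[DONE]" else json.loads(item))
```

Decoding each chunk on its own would fail, or substitute replacement characters, at the cut point. Deferring the decode until the event boundary avoids this regardless of how the transport splits the bytes.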


@@ -8,11 +8,11 @@ This includes: OpenAI, LinkAI, Azure OpenAI, and many third-party providers.
"""
import json
import openai
import requests
from typing import Optional
from common.log import logger
from agent.protocol.message_utils import drop_orphaned_tool_results_openai
from models.openai.openai_http_client import OpenAIHTTPClient, OpenAIHTTPError
class OpenAICompatibleBot:
@@ -135,49 +135,87 @@ class OpenAICompatibleBot:
"status_code": 500
}
def _get_http_client(self) -> OpenAIHTTPClient:
"""Build an HTTP client honoring the global proxy config.
Subclasses can override this for custom auth headers (e.g. Azure's
``api-key`` header) by returning a pre-configured client.
"""
from config import conf
proxy = conf().get("proxy") or None
return OpenAIHTTPClient(proxy=proxy)
def _handle_sync_response(self, request_params, api_key, api_base):
"""Handle synchronous OpenAI API response"""
"""Handle synchronous chat-completion via HTTP."""
params = dict(request_params)
params.pop("stream", None)
# Translate legacy SDK timeout kwarg to our HTTP client kwarg.
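# Pop both keys unconditionally so a leftover "timeout" entry cannot
# collide with the explicit timeout= keyword below.
legacy_timeout = params.pop("request_timeout", None)
timeout = legacy_timeout or params.pop("timeout", None)
params.pop("timeout", None)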
try:
# Build kwargs with explicit API configuration
kwargs = dict(request_params)
if api_key:
kwargs["api_key"] = api_key
if api_base:
kwargs["api_base"] = api_base
response = openai.ChatCompletion.create(**kwargs)
return response
client = self._get_http_client()
return client.chat_completions(
api_key=api_key,
api_base=api_base,
timeout=timeout,
stream=False,
**params,
)
except OpenAIHTTPError as e:
logger.error(
f"[{self.__class__.__name__}] sync response error: "
f"HTTP {e.status_code}: {e.message}"
)
return {
"error": True,
"message": e.message,
"status_code": e.status_code or 500,
}
except Exception as e:
logger.error(f"[{self.__class__.__name__}] sync response error: {e}")
return {
"error": True,
"message": str(e),
"status_code": 500
"status_code": 500,
}
def _handle_stream_response(self, request_params, api_key, api_base):
"""Handle streaming OpenAI API response"""
"""Handle streaming chat-completion via HTTP (SSE).
Yields dict chunks in OpenAI's standard streaming shape:
{"choices": [{"delta": {...}, "finish_reason": ...}], ...}
On error, yields a single ``{"error": ..., "status_code": ...}`` chunk
— the same contract :mod:`agent.protocol.agent_stream` already handles.
"""
params = dict(request_params)
params.pop("stream", None)
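# Pop both keys unconditionally so a leftover "timeout" entry cannot
# collide with the explicit timeout= keyword below.
legacy_timeout = params.pop("request_timeout", None)
timeout = legacy_timeout or params.pop("timeout", None)
params.pop("timeout", None)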
try:
# Build kwargs with explicit API configuration
kwargs = dict(request_params)
if api_key:
kwargs["api_key"] = api_key
if api_base:
kwargs["api_base"] = api_base
stream = openai.ChatCompletion.create(**kwargs)
# Stream chunks to caller
client = self._get_http_client()
stream = client.chat_completions(
api_key=api_key,
api_base=api_base,
timeout=timeout,
stream=True,
**params,
)
for chunk in stream:
yield chunk
except OpenAIHTTPError as e:
logger.error(
f"[{self.__class__.__name__}] stream response error: "
f"HTTP {e.status_code}: {e.message}"
)
yield {
"error": True,
"message": e.message,
"status_code": e.status_code or 500,
}
except Exception as e:
logger.error(f"[{self.__class__.__name__}] stream response error: {e}")
yield {
"error": True,
"message": str(e),
"status_code": 500
"status_code": 500,
}
def _convert_tools_to_openai_format(self, tools):
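
Two error-chunk shapes appear in this diff: the legacy `{"error": True, ...}` form and the HTTP client's `{"error": {...}, ...}` form. Both keep `message` and `status_code` at the top level, so a consumer can handle them with a single truthiness check. The sketch below is one way to do that. `collect_stream_text` is a hypothetical helper written for illustration and is not part of `agent_stream`.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def collect_stream_text(
    chunks: Iterable[Dict[str, Any]],
) -> Tuple[str, Optional[str]]:
    """Concatenate delta content; stop at the first error chunk.

    Returns (text_so_far, error_description_or_None).
    """
    parts = []
    for chunk in chunks:
        if chunk.get("error"):
            # True and a non-empty dict are both truthy, so one check covers
            # both shapes; the top-level fields are shared by the two forms.
            error = "{}: {}".format(chunk.get("status_code"), chunk.get("message"))
            return "".join(parts), error
        choices = chunk.get("choices") or []
        if not choices:
            continue  # e.g. a usage-only chunk with an empty choices list
        delta = choices[0].get("delta") or {}
        parts.append(delta.get("content") or "")
    return "".join(parts), None


if __name__ == "__main__":
    stream = [
        {"choices": [{"delta": {"content": "Hel"}}]},
        {"choices": [{"delta": {"content": "lo"}, "finish_reason": "stop"}]},
    ]
    print(collect_stream_text(stream))
```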


@@ -0,0 +1 @@
# encoding:utf-8


@@ -0,0 +1,238 @@
# encoding:utf-8
import time
import requests
from bridge.context import ContextType
from bridge.reply import Reply, ReplyType
from common import const
from common.log import logger
from config import conf, load_config
from models.bot import Bot
from models.openai_compatible_bot import OpenAICompatibleBot
from models.session_manager import SessionManager
from .qianfan_session import QianfanSession
DEFAULT_API_BASE = "https://qianfan.baidubce.com/v2"
DEFAULT_MODEL = const.ERNIE_5
DEFAULT_VISION_MODEL = const.ERNIE_45_TURBO_VL
# Qianfan models that natively understand images. Other ERNIE variants
# are text-only and must not receive image payloads.
_VISION_CAPABLE_MODELS = {
const.ERNIE_5,
const.ERNIE_X1_1,
const.ERNIE_45_TURBO_VL,
const.ERNIE_45_TURBO_VL_32K,
}
class QianfanBot(Bot, OpenAICompatibleBot):
@property
def supports_vision(self) -> bool:
"""Whether the configured main model is multimodal."""
return (conf().get("model") or "").lower() in _VISION_CAPABLE_MODELS
def __init__(self):
super().__init__()
model = self._resolve_model()
self.sessions = SessionManager(QianfanSession, model=model)
self.args = {
"model": model,
"temperature": conf().get("temperature", 0.7),
"top_p": conf().get("top_p", 1.0),
"frequency_penalty": conf().get("frequency_penalty", 0.0),
"presence_penalty": conf().get("presence_penalty", 0.0),
}
def _resolve_model(self):
model = conf().get("model") or DEFAULT_MODEL
if model == const.QIANFAN:
return DEFAULT_MODEL
return model
@property
def api_key(self):
return conf().get("qianfan_api_key")
@property
def api_base(self):
url = conf().get("qianfan_api_base") or DEFAULT_API_BASE
url = url.rstrip("/")
suffix = "/chat/completions"
if url.endswith(suffix):
url = url[:-len(suffix)]
return url.rstrip("/")
def get_api_config(self):
return {
"api_key": self.api_key,
"api_base": self.api_base,
"model": self._resolve_model(),
"default_temperature": conf().get("temperature", 0.7),
"default_top_p": conf().get("top_p", 1.0),
"default_frequency_penalty": conf().get("frequency_penalty", 0.0),
"default_presence_penalty": conf().get("presence_penalty", 0.0),
}
def _build_headers(self):
return {
"Content-Type": "application/json",
"Authorization": "Bearer {}".format(self.api_key),
}
def reply(self, query, context=None):
if context.type == ContextType.TEXT:
logger.info("[QIANFAN] query={}".format(query))
session_id = context["session_id"]
reply = None
clear_memory_commands = conf().get("clear_memory_commands", ["#清除记忆"])
if query in clear_memory_commands:
self.sessions.clear_session(session_id)
reply = Reply(ReplyType.INFO, "记忆已清除")
elif query == "#清除所有":
self.sessions.clear_all_session()
reply = Reply(ReplyType.INFO, "所有人记忆已清除")
elif query == "#更新配置":
load_config()
reply = Reply(ReplyType.INFO, "配置已更新")
if reply:
return reply
session = self.sessions.session_query(query, session_id)
logger.debug("[QIANFAN] session query={}".format(session.messages))
reply_content = self.reply_text(session, args=self.args.copy())
logger.debug(
"[QIANFAN] new_query={}, session_id={}, reply_cont={}, completion_tokens={}".format(
session.messages,
session_id,
reply_content["content"],
reply_content["completion_tokens"],
)
)
if reply_content["completion_tokens"] == 0 and len(reply_content["content"]) > 0:
reply = Reply(ReplyType.ERROR, reply_content["content"])
elif reply_content["completion_tokens"] > 0:
self.sessions.session_reply(
reply_content["content"], session_id, reply_content["total_tokens"],
)
reply = Reply(ReplyType.TEXT, reply_content["content"])
else:
reply = Reply(ReplyType.ERROR, reply_content["content"])
logger.debug("[QIANFAN] reply {} used 0 tokens.".format(reply_content))
return reply
else:
reply = Reply(ReplyType.ERROR, "Bot不支持处理{}类型的消息".format(context.type))
return reply
def reply_text(self, session, args=None, retry_count=0):
try:
body = dict(args) if args else dict(self.args)
body["messages"] = session.messages
response = requests.post(
"{}/chat/completions".format(self.api_base),
headers=self._build_headers(),
json=body,
timeout=conf().get("request_timeout", 180),
)
if response.status_code == 200:
data = response.json()
return {
"total_tokens": data["usage"]["total_tokens"],
"completion_tokens": data["usage"]["completion_tokens"],
"content": data["choices"][0]["message"]["content"],
}
return self._error_result(response, session, args, retry_count)
except Exception as e:
logger.exception(e)
if retry_count < 2:
return self.reply_text(session, args, retry_count + 1)
return {"completion_tokens": 0, "content": "我现在有点累了,等会再来吧"}
def call_vision(self, image_url: str, question: str,
model: str = None, max_tokens: int = 1000) -> dict:
vision_model = model or DEFAULT_VISION_MODEL
payload = {
"model": vision_model,
"messages": [
{
"role": "user",
"content": [
{"type": "text", "text": question},
{"type": "image_url", "image_url": {"url": image_url}},
],
}
],
"max_tokens": max_tokens,
}
try:
response = requests.post(
"{}/chat/completions".format(self.api_base),
headers=self._build_headers(),
json=payload,
timeout=conf().get("request_timeout", 180),
)
if response.status_code != 200:
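# Pass retry_count=2 so _error_result only formats the message; its
# retry path calls reply_text(), which needs a real session.
err = self._error_result(response, None, retry_count=2)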
return {
"error": True,
"message": err.get("content", "Qianfan vision request failed"),
}
data = response.json()
choices = data.get("choices", [])
content = choices[0].get("message", {}).get("content", "") if choices else ""
usage = data.get("usage", {}) or {}
return {
"content": content,
"model": data.get("model", vision_model),
"usage": {
"prompt_tokens": usage.get("prompt_tokens", 0),
"completion_tokens": usage.get("completion_tokens", 0),
"total_tokens": usage.get("total_tokens", 0),
},
}
except Exception as e:
logger.exception(e)
return {"error": True, "message": str(e)}
def _error_result(self, response, session, args=None, retry_count=0):
try:
body = response.json()
except ValueError:
body = {"raw": response.text}
error = body.get("error") if isinstance(body, dict) else None
if isinstance(error, dict):
message = error.get("message") or str(error)
elif error:
message = str(error)
elif isinstance(body, dict) and body.get("raw") is not None:
message = str(body.get("raw"))
else:
message = str(body)
logger.error(
"[QIANFAN] chat failed, status_code={}, msg={}".format(
response.status_code, message
)
)
if response.status_code >= 500 and retry_count < 2:
time.sleep(3)
return self.reply_text(session, args, retry_count + 1)
if response.status_code == 401:
content = "授权失败,请检查 Qianfan API Key 是否正确"
elif response.status_code == 429:
if retry_count < 2:
time.sleep(3)
return self.reply_text(session, args, retry_count + 1)
content = "请求过于频繁,请稍后再试"
else:
content = "请求失败:{}".format(message)
return {"completion_tokens": 0, "content": content}
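
For reference, the request body that `call_vision` sends follows the OpenAI-compatible multimodal chat format, and the response is read defensively. The sketch below restates both steps as standalone functions. The helper names `build_vision_payload` and `extract_content` are hypothetical (they do not appear in the diff), and the default model name is the one the tests expect.

```python
def build_vision_payload(image_url, question, model="ernie-4.5-turbo-vl", max_tokens=1000):
    """Hypothetical helper: build an OpenAI-compatible multimodal chat body."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Text part first, then the image reference (URL or data URI).
                    {"type": "text", "text": question},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }


def extract_content(data):
    """Read the first choice defensively; return "" when choices are missing."""
    choices = data.get("choices", [])
    return choices[0].get("message", {}).get("content", "") if choices else ""
```

Keeping the payload construction separate from the HTTP call makes the shape easy to assert on in tests without mocking `requests.post`.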

@@ -0,0 +1,57 @@
from models.session_manager import Session
from common.log import logger
class QianfanSession(Session):
def __init__(self, session_id, system_prompt=None, model="ernie-5.0"):
super().__init__(session_id, system_prompt)
self.model = model
self.reset()
def discard_exceeding(self, max_tokens, cur_tokens=None):
precise = True
try:
cur_tokens = self.calc_tokens()
except Exception as e:
precise = False
if cur_tokens is None:
raise e
logger.debug("Exception when counting tokens precisely for query: {}".format(e))
while cur_tokens > max_tokens:
if len(self.messages) > 2:
self.messages.pop(1)
elif len(self.messages) == 2 and self.messages[1]["role"] == "assistant":
self.messages.pop(1)
if precise:
cur_tokens = self.calc_tokens()
else:
cur_tokens = cur_tokens - max_tokens
break
elif len(self.messages) == 2 and self.messages[1]["role"] == "user":
logger.warning("user message exceed max_tokens. total_tokens={}".format(cur_tokens))
break
else:
logger.debug("max_tokens={}, total_tokens={}, len(messages)={}".format(
max_tokens, cur_tokens, len(self.messages)))
break
if precise:
cur_tokens = self.calc_tokens()
else:
cur_tokens = cur_tokens - max_tokens
return cur_tokens
def calc_tokens(self):
return num_tokens_from_messages(self.messages, self.model)
def num_tokens_from_messages(messages, model):
tokens = 0
for msg in messages:
content = msg.get("content", "")
if isinstance(content, str):
tokens += len(content)
elif isinstance(content, list):
for block in content:
if isinstance(block, dict):
tokens += len(block.get("text", ""))
return tokens
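
The session file above estimates tokens by counting characters rather than calling a tokenizer, and trims history from the oldest non-system message. A minimal sketch of that idea follows. `estimate_tokens` and `trim_history` are hypothetical names, the trimming loop is a simplification of `discard_exceeding` (it omits the two-message edge cases), and it assumes index 0 holds the system prompt.

```python
def estimate_tokens(messages):
    """Approximate token usage by counting characters of text content."""
    total = 0
    for msg in messages:
        content = msg.get("content", "")
        if isinstance(content, str):
            total += len(content)
        elif isinstance(content, list):
            # Multimodal content: only blocks carrying a "text" field are counted.
            total += sum(len(b.get("text", "")) for b in content if isinstance(b, dict))
    return total


def trim_history(messages, max_tokens):
    """Simplified trimming: drop the oldest non-system message until under budget."""
    while estimate_tokens(messages) > max_tokens and len(messages) > 2:
        messages.pop(1)
    return messages


history = [
    {"role": "system", "content": "sys"},
    {"role": "user", "content": "aaaaaaaaaa"},
    {"role": "assistant", "content": "bbbbbbbbbb"},
    {"role": "user", "content": [
        {"type": "text", "text": "cc"},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    ]},
]
```

Note that image blocks contribute nothing to the estimate, so a character count can undercount multimodal turns considerably.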

@@ -428,11 +428,12 @@ class CowCliPlugin(Plugin):
@staticmethod
def _resolve_bot_type_for_model(model_name: str) -> str:
"""Resolve bot_type from model name, reusing AgentBridge mapping."""
"""Resolve bot_type from model name, matching AgentBridge mapping."""
from common import const
_EXACT = {
"wenxin": const.BAIDU, "wenxin-4": const.BAIDU,
"xunfei": const.XUNFEI, const.QWEN: const.QWEN_DASHSCOPE,
const.QIANFAN: const.QIANFAN,
const.MODELSCOPE: const.MODELSCOPE,
const.MOONSHOT: const.MOONSHOT,
"moonshot-v1-8k": const.MOONSHOT, "moonshot-v1-32k": const.MOONSHOT,
@@ -445,6 +446,7 @@ class CowCliPlugin(Plugin):
("claude", const.CLAUDEAPI),
("moonshot", const.MOONSHOT), ("kimi", const.MOONSHOT),
("doubao", const.DOUBAO), ("deepseek", const.DEEPSEEK),
("ernie", const.QIANFAN),
]
if not model_name:
return const.OPENAI
@@ -454,8 +456,9 @@ class CowCliPlugin(Plugin):
return const.MiniMax
if model_name in [const.QWEN_TURBO, const.QWEN_PLUS, const.QWEN_MAX]:
return const.QWEN_DASHSCOPE
lowered_model = model_name.lower()
for prefix, btype in _PREFIX:
if model_name.startswith(prefix):
if lowered_model.startswith(prefix):
return btype
return const.OPENAI
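
The change in this hunk lowercases the model name before prefix matching, so `ERNIE-...` and `ernie-...` route the same way. A standalone sketch of that lookup, using a hypothetical table and placeholder string values in place of the real `_PREFIX` list and `const.*` constants (only `"qianfan"` is confirmed by the tests):

```python
# Hypothetical routing table; the real one is _PREFIX and uses const.* values.
PREFIX_ROUTES = [
    ("claude", "claude"),
    ("deepseek", "deepseek"),
    ("ernie", "qianfan"),
]
DEFAULT_BOT_TYPE = "openai"  # placeholder standing in for const.OPENAI


def resolve_bot_type(model_name):
    """Return the bot type for a model name using case-insensitive prefix matching."""
    if not model_name:
        return DEFAULT_BOT_TYPE
    lowered = model_name.lower()  # prefixes in the table are all lowercase
    for prefix, bot_type in PREFIX_ROUTES:
        if lowered.startswith(prefix):
            return bot_type
    return DEFAULT_BOT_TYPE
```

Because the first matching prefix wins, table order matters whenever one prefix is itself a prefix of another.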

@@ -44,6 +44,7 @@ class StoryTeller:
@plugins.register(
name="Dungeon",
desire_priority=0,
enabled=False,
namecn="文字冒险",
desc="A plugin to play dungeon game",
version="1.0",

@@ -315,7 +315,7 @@ class Godcmd(Plugin):
except Exception as e:
ok, result = False, "你没有设置私有GPT模型"
elif cmd == "reset":
if bottype in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.LINKAI, const.BAIDU, const.XUNFEI, const.QWEN, const.QWEN_DASHSCOPE, const.GEMINI, const.ZHIPU_AI, const.CLAUDEAPI]:
if bottype in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.LINKAI, const.BAIDU, const.QIANFAN, const.XUNFEI, const.QWEN, const.QWEN_DASHSCOPE, const.GEMINI, const.ZHIPU_AI, const.CLAUDEAPI]:
bot.sessions.clear_session(session_id)
if Bridge().chat_bots.get(bottype):
Bridge().chat_bots.get(bottype).sessions.clear_session(session_id)
@@ -341,7 +341,7 @@ class Godcmd(Plugin):
ok, result = True, "配置已重载"
elif cmd == "resetall":
if bottype in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.LINKAI,
const.BAIDU, const.XUNFEI, const.QWEN, const.QWEN_DASHSCOPE, const.GEMINI, const.ZHIPU_AI, const.MOONSHOT,
const.BAIDU, const.QIANFAN, const.XUNFEI, const.QWEN, const.QWEN_DASHSCOPE, const.GEMINI, const.ZHIPU_AI, const.MOONSHOT,
const.MODELSCOPE]:
channel.cancel_all_session()
bot.sessions.clear_all_session()

@@ -13,6 +13,7 @@ from config import conf
name="Hello",
desire_priority=-1,
hidden=True,
enabled=False,
desc="A simple plugin that says hello",
version="0.1",
author="lanvent",

@@ -34,7 +34,9 @@ class PluginManager:
plugincls.version = kwargs.get("version") if kwargs.get("version") != None else "1.0"
plugincls.namecn = kwargs.get("namecn") if kwargs.get("namecn") != None else name
plugincls.hidden = kwargs.get("hidden") if kwargs.get("hidden") != None else False
plugincls.enabled = True
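# enabled defaults to True; example plugins can pass enabled=False explicitly in the decorator
# so that they are written to plugins.json as disabled on first startup and do not intercept user messages.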
plugincls.enabled = kwargs.get("enabled", True)
if self.current_plugin_path == None:
raise Exception("Plugin path not set")
self.plugins[name.upper()] = plugincls
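
The effect of reading `enabled` from the decorator keyword arguments can be seen in a stripped-down registry. This is a sketch only: `registry` and this `register` function are hypothetical stand-ins for `PluginManager` and `plugins.register`, and the two classes are empty placeholders.

```python
registry = {}


def register(name, **kwargs):
    """Hypothetical, minimal stand-in for the plugin registration decorator."""
    def wrapper(cls):
        # Missing key -> enabled by default; an explicit enabled=False is kept.
        cls.enabled = kwargs.get("enabled", True)
        registry[name.upper()] = cls
        return cls
    return wrapper


@register(name="Hello", enabled=False)
class Hello:
    pass


@register(name="Role")
class Role:
    pass
```

One design detail worth noting: `kwargs.get("enabled", True)` only falls back to `True` when the key is absent, so an explicit `enabled=None` would be stored as `None`, unlike the `!= None` checks used for the neighbouring attributes.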

@@ -99,7 +99,7 @@ class Role(Plugin):
if e_context["context"].type != ContextType.TEXT:
return
btype = Bridge().get_bot_type("chat")
if btype not in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.QWEN_DASHSCOPE, const.XUNFEI, const.BAIDU, const.ZHIPU_AI, const.MOONSHOT, const.MiniMax, const.LINKAI, const.MODELSCOPE]:
if btype not in [const.OPEN_AI, const.OPENAI, const.CHATGPT, const.CHATGPTONAZURE, const.QWEN_DASHSCOPE, const.XUNFEI, const.BAIDU, const.QIANFAN, const.ZHIPU_AI, const.MOONSHOT, const.MiniMax, const.LINKAI, const.MODELSCOPE]:
logger.debug(f'不支持的bot: {btype}')
return
bot = Bridge().get_bot("chat")

@@ -1,4 +1,3 @@
openai==0.27.8
aiohttp>=3.8.6,<3.10
requests>=2.28.2
chardet>=5.1.0
@@ -19,8 +18,8 @@ zai-sdk
# tongyi qwen sdk
dashscope
# feishu websocket mode
lark-oapi
# feishu
lark-oapi>=1.5.5
# dingtalk
dingtalk_stream
# wecom bot websocket mode

@@ -0,0 +1,601 @@
# encoding:utf-8
import os
import sys
import unittest
from unittest.mock import MagicMock, patch
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
class TestQianfanConstantsAndRouting(unittest.TestCase):
def test_qianfan_provider_constant_defined(self):
from common import const
self.assertEqual(const.QIANFAN, "qianfan")
def test_ernie_constants_are_in_model_list(self):
from common import const
self.assertEqual(const.ERNIE_45_TURBO_128K, "ernie-4.5-turbo-128k")
self.assertEqual(const.ERNIE_45_TURBO_32K, "ernie-4.5-turbo-32k")
self.assertEqual(const.ERNIE_X1_1, "ernie-x1.1")
self.assertEqual(
const.ERNIE_45_TURBO_VL,
"ernie-4.5-turbo-vl",
)
self.assertEqual(
const.ERNIE_45_TURBO_VL_32K,
"ernie-4.5-turbo-vl-32k",
)
self.assertIn(const.QIANFAN, const.MODEL_LIST)
self.assertIn(const.ERNIE_45_TURBO_128K, const.MODEL_LIST)
self.assertIn(const.ERNIE_45_TURBO_32K, const.MODEL_LIST)
self.assertIn(const.ERNIE_X1_1, const.MODEL_LIST)
self.assertIn(const.ERNIE_45_TURBO_VL, const.MODEL_LIST)
self.assertIn(const.ERNIE_45_TURBO_VL_32K, const.MODEL_LIST)
def test_qianfan_config_keys_are_available(self):
import config
self.assertIn("qianfan_api_key", config.available_setting)
self.assertIn("qianfan_api_base", config.available_setting)
def test_agent_bridge_routes_ernie_models_to_qianfan(self):
from bridge.agent_bridge import AgentLLMModel
from common import const
model = AgentLLMModel.__new__(AgentLLMModel)
fake_conf = MagicMock()
fake_conf.get.side_effect = lambda key, default=None: {
"use_linkai": False,
"linkai_api_key": "",
"bot_type": "",
}.get(key, default)
with patch("bridge.agent_bridge.conf", return_value=fake_conf):
self.assertEqual(
AgentLLMModel._resolve_bot_type(model, "ernie-4.5-turbo-128k"),
const.QIANFAN,
)
self.assertEqual(
AgentLLMModel._resolve_bot_type(model, "qianfan"),
const.QIANFAN,
)
def test_cow_cli_routes_ernie_models_to_qianfan(self):
from common import const
import plugins
old_plugin_path = plugins.instance.current_plugin_path
cow_cli_was_registered = "COW_CLI" in plugins.instance.plugins
old_cow_cli_plugin = plugins.instance.plugins.get("COW_CLI")
parent_had_cow_cli = hasattr(plugins, "cow_cli")
old_parent_cow_cli = getattr(plugins, "cow_cli", None)
module_names = ("plugins.cow_cli", "plugins.cow_cli.cow_cli")
old_modules = {
name: sys.modules[name]
for name in module_names
if name in sys.modules
}
plugins.instance.current_plugin_path = os.path.join(
os.path.dirname(__file__), "..", "plugins", "cow_cli"
)
try:
import plugins.cow_cli.cow_cli
cow_cli_plugin = plugins.instance.plugins["COW_CLI"]
finally:
plugins.instance.current_plugin_path = old_plugin_path
if cow_cli_was_registered:
plugins.instance.plugins["COW_CLI"] = old_cow_cli_plugin
else:
plugins.instance.plugins.pop("COW_CLI", None)
for name in module_names:
if name in old_modules:
sys.modules[name] = old_modules[name]
else:
sys.modules.pop(name, None)
if parent_had_cow_cli:
plugins.cow_cli = old_parent_cow_cli
elif hasattr(plugins, "cow_cli"):
delattr(plugins, "cow_cli")
self.assertEqual(
cow_cli_plugin._resolve_bot_type_for_model("ernie-4.5-turbo-128k"),
const.QIANFAN,
)
self.assertEqual(
cow_cli_plugin._resolve_bot_type_for_model("qianfan"),
const.QIANFAN,
)
class TestQianfanBot(unittest.TestCase):
def _fake_conf(self, values=None):
data = {
"model": "ernie-5.0",
"qianfan_api_key": "test-qianfan-key",
"qianfan_api_base": "https://qianfan.baidubce.com/v2",
"temperature": 0.7,
"top_p": 1.0,
"frequency_penalty": 0.0,
"presence_penalty": 0.0,
"request_timeout": 180,
"clear_memory_commands": ["#清除记忆"],
"conversation_max_tokens": 1000,
"expires_in_seconds": 3600,
}
if values:
data.update(values)
fake_conf = MagicMock()
fake_conf.get.side_effect = lambda key, default=None: data.get(key, default)
return fake_conf
def test_bot_factory_returns_qianfan_bot(self):
from common import const
from models.bot_factory import create_bot
fake_conf = self._fake_conf()
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
bot = create_bot(const.QIANFAN)
from models.qianfan.qianfan_bot import QianfanBot
self.assertIsInstance(bot, QianfanBot)
def test_default_model_uses_ernie_when_model_is_provider_alias(self):
fake_conf = self._fake_conf({"model": "qianfan"})
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
self.assertEqual(bot.args["model"], "ernie-5.0")
def test_reply_text_posts_openai_compatible_payload(self):
fake_conf = self._fake_conf()
fake_response = MagicMock()
fake_response.status_code = 200
fake_response.json.return_value = {
"choices": [{"message": {"content": "你好,我是文心。"}}],
"usage": {"total_tokens": 12, "completion_tokens": 6},
}
session = MagicMock()
session.messages = [{"role": "user", "content": "你好"}]
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
with patch("models.qianfan.qianfan_bot.requests.post", return_value=fake_response) as post:
result = bot.reply_text(session)
self.assertEqual(result["content"], "你好,我是文心。")
self.assertEqual(result["total_tokens"], 12)
self.assertEqual(result["completion_tokens"], 6)
post.assert_called_once()
url = post.call_args.args[0]
kwargs = post.call_args.kwargs
self.assertEqual(url, "https://qianfan.baidubce.com/v2/chat/completions")
self.assertEqual(kwargs["headers"]["Authorization"], "Bearer test-qianfan-key")
self.assertEqual(kwargs["json"]["model"], "ernie-5.0")
self.assertEqual(kwargs["json"]["messages"], [{"role": "user", "content": "你好"}])
def test_reply_text_returns_auth_error_for_401(self):
fake_conf = self._fake_conf()
fake_response = MagicMock()
fake_response.status_code = 401
fake_response.json.return_value = {"error": {"message": "invalid api key"}}
fake_response.text = '{"error":{"message":"invalid api key"}}'
session = MagicMock()
session.messages = [{"role": "user", "content": "你好"}]
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
with patch("models.qianfan.qianfan_bot.requests.post", return_value=fake_response):
result = bot.reply_text(session)
self.assertEqual(result["completion_tokens"], 0)
self.assertEqual(result["content"], "授权失败,请检查 Qianfan API Key 是否正确")
def test_reply_text_returns_raw_message_for_non_json_error(self):
fake_conf = self._fake_conf()
fake_response = MagicMock()
fake_response.status_code = 400
fake_response.json.side_effect = ValueError
fake_response.text = "bad gateway text"
session = MagicMock()
session.messages = [{"role": "user", "content": "你好"}]
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
with patch("models.qianfan.qianfan_bot.requests.post", return_value=fake_response) as post:
result = bot.reply_text(session)
self.assertEqual(result["completion_tokens"], 0)
self.assertEqual(result["content"], "请求失败:bad gateway text")
post.assert_called_once()
def test_qianfan_bot_supports_vision_for_multimodal_models(self):
for model in ("ernie-5.0", "ernie-x1.1", "ernie-4.5-turbo-vl", "ernie-4.5-turbo-vl-32k"):
fake_conf = self._fake_conf({"model": model})
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
self.assertTrue(
bot.supports_vision,
msg=f"{model} should be marked as multimodal",
)
def test_qianfan_bot_does_not_advertise_vision_for_text_only_models(self):
for model in ("ernie-4.5-turbo-128k", "ernie-4.5-turbo-32k"):
fake_conf = self._fake_conf({"model": model})
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
self.assertFalse(
bot.supports_vision,
msg=f"{model} should not be marked as multimodal",
)
def test_call_vision_posts_openai_compatible_multimodal_payload(self):
fake_conf = self._fake_conf()
fake_response = MagicMock()
fake_response.status_code = 200
fake_response.json.return_value = {
"id": "chatcmpl-test",
"model": "ernie-4.5-turbo-vl",
"choices": [{"message": {"content": "图中有一个红色方块。"}}],
"usage": {
"prompt_tokens": 10,
"completion_tokens": 8,
"total_tokens": 18,
},
}
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
with patch("models.qianfan.qianfan_bot.requests.post", return_value=fake_response) as post:
result = bot.call_vision(
image_url="data:image/png;base64,AAAA",
question="这张图里有什么?",
)
self.assertEqual(result["content"], "图中有一个红色方块。")
self.assertEqual(result["model"], "ernie-4.5-turbo-vl")
self.assertEqual(result["usage"]["total_tokens"], 18)
post.assert_called_once()
url = post.call_args.args[0]
kwargs = post.call_args.kwargs
self.assertEqual(url, "https://qianfan.baidubce.com/v2/chat/completions")
self.assertEqual(kwargs["headers"]["Authorization"], "Bearer test-qianfan-key")
self.assertEqual(kwargs["json"]["model"], "ernie-4.5-turbo-vl")
self.assertEqual(kwargs["json"]["max_tokens"], 1000)
self.assertEqual(kwargs["json"]["messages"], [
{
"role": "user",
"content": [
{"type": "text", "text": "这张图里有什么?"},
{
"type": "image_url",
"image_url": {"url": "data:image/png;base64,AAAA"},
},
],
}
])
def test_call_vision_allows_explicit_model_override(self):
fake_conf = self._fake_conf()
fake_response = MagicMock()
fake_response.status_code = 200
fake_response.json.return_value = {
"model": "ernie-4.5-turbo-vl-32k",
"choices": [{"message": {"content": "有文字。"}}],
"usage": {},
}
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
with patch("models.qianfan.qianfan_bot.requests.post", return_value=fake_response) as post:
result = bot.call_vision(
image_url="data:image/jpeg;base64,BBBB",
question="识别文字",
model="ernie-4.5-turbo-vl-32k",
max_tokens=256,
)
self.assertEqual(result["model"], "ernie-4.5-turbo-vl-32k")
self.assertEqual(post.call_args.kwargs["json"]["model"], "ernie-4.5-turbo-vl-32k")
self.assertEqual(post.call_args.kwargs["json"]["max_tokens"], 256)
def test_call_vision_returns_error_dict_for_api_error(self):
fake_conf = self._fake_conf()
fake_response = MagicMock()
fake_response.status_code = 400
fake_response.json.return_value = {"error": {"message": "bad image"}}
fake_response.text = '{"error":{"message":"bad image"}}'
with patch("models.qianfan.qianfan_bot.conf", return_value=fake_conf):
with patch("models.qianfan.qianfan_bot.SessionManager"):
from models.qianfan.qianfan_bot import QianfanBot
bot = QianfanBot()
with patch("models.qianfan.qianfan_bot.requests.post", return_value=fake_response):
result = bot.call_vision(
image_url="data:image/png;base64,AAAA",
question="这张图里有什么?",
)
self.assertTrue(result["error"])
self.assertEqual(result["message"], "请求失败:bad image")
class TestQianfanSurfaces(unittest.TestCase):
def _read(self, relative_path):
root = os.path.join(os.path.dirname(__file__), "..")
with open(os.path.join(root, relative_path), encoding="utf-8") as f:
return f.read()
def test_web_console_registers_qianfan_provider(self):
source = self._read("channel/web/web_channel.py")
self.assertIn('("qianfan", {', source)
self.assertIn('"label": "百度千帆"', source)
self.assertIn('"api_key_field": "qianfan_api_key"', source)
self.assertIn('"api_base_key": "qianfan_api_base"', source)
self.assertIn('"api_base_default": "https://qianfan.baidubce.com/v2"', source)
def test_web_console_allows_qianfan_config_edits(self):
source = self._read("channel/web/web_channel.py")
self.assertIn('"qianfan_api_base"', source)
self.assertIn('"qianfan_api_key"', source)
def test_session_plugins_allow_qianfan(self):
role_source = self._read("plugins/role/role.py")
godcmd_source = self._read("plugins/godcmd/godcmd.py")
self.assertIn("const.QIANFAN", role_source)
self.assertIn("const.QIANFAN", godcmd_source)
class TestQianfanVisionTool(unittest.TestCase):
def _fake_conf(self, values=None):
data = {
"model": "deepseek-v4-flash",
"qianfan_api_key": "",
"qianfan_api_base": "https://qianfan.baidubce.com/v2",
"open_ai_api_key": "",
"linkai_api_key": "",
"use_linkai": False,
"tool": {},
}
if values:
data.update(values)
fake_conf = MagicMock()
fake_conf.get.side_effect = lambda key, default=None: data.get(key, default)
return fake_conf
def test_vision_auto_discovers_qianfan_when_key_configured(self):
fake_conf = self._fake_conf({"qianfan_api_key": "test-qianfan-key"})
fake_bot = MagicMock()
fake_bot.call_vision = MagicMock()
with patch("agent.tools.vision.vision.conf", return_value=fake_conf):
with patch("models.bot_factory.create_bot", return_value=fake_bot) as create_bot:
from agent.tools.vision.vision import Vision
from common import const
tool = Vision()
tool.model = None
providers = tool._resolve_providers()
self.assertEqual(providers[0].name, "Qianfan")
self.assertEqual(providers[0].model_override, const.ERNIE_45_TURBO_VL)
self.assertTrue(providers[0].use_bot)
create_bot.assert_called_with(const.QIANFAN)
def test_vision_routes_ernie_model_override_to_qianfan(self):
fake_conf = self._fake_conf({
"qianfan_api_key": "test-qianfan-key",
"tool": {"vision": {"model": "ernie-4.5-turbo-vl-32k"}},
})
fake_bot = MagicMock()
fake_bot.call_vision = MagicMock()
with patch("agent.tools.vision.vision.conf", return_value=fake_conf):
with patch("models.bot_factory.create_bot", return_value=fake_bot):
from agent.tools.vision.vision import Vision
tool = Vision()
tool.model = None
providers = tool._resolve_providers()
self.assertEqual(providers[0].name, "Qianfan")
self.assertEqual(providers[0].model_override, "ernie-4.5-turbo-vl-32k")
def test_vision_main_model_uses_qianfan_when_configured_model_is_ernie(self):
fake_conf = self._fake_conf({"model": "ernie-4.5-turbo-vl-32k"})
from common import const
fake_model = MagicMock()
fake_model._resolve_bot_type.return_value = const.QIANFAN
fake_model.bot = MagicMock()
fake_model.bot.supports_vision = True
fake_model.bot.call_vision = MagicMock()
with patch("agent.tools.vision.vision.conf", return_value=fake_conf):
from agent.tools.vision.vision import Vision
tool = Vision()
tool.model = fake_model
providers = tool._resolve_providers()
self.assertEqual(providers[0].name, "MainModel")
self.assertEqual(providers[0].model_override, "ernie-4.5-turbo-vl-32k")
def test_vision_main_model_uses_ernie_5_directly(self):
"""ERNIE 5.0 is omni-modal → main-model path forwards image to it."""
fake_conf = self._fake_conf({"model": "ernie-5.0"})
from common import const
fake_model = MagicMock()
fake_model._resolve_bot_type.return_value = const.QIANFAN
fake_model.bot = MagicMock()
fake_model.bot.supports_vision = True
fake_model.bot.call_vision = MagicMock()
with patch("agent.tools.vision.vision.conf", return_value=fake_conf):
from agent.tools.vision.vision import Vision
tool = Vision()
tool.model = fake_model
providers = tool._resolve_providers()
self.assertEqual(providers[0].name, "MainModel")
self.assertEqual(providers[0].model_override, "ernie-5.0")
def test_vision_falls_back_to_qianfan_vl_when_main_model_is_text_only_ernie(self):
"""Text-only ERNIE (e.g. ernie-4.5-turbo-128k) must NOT receive image
payloads — Vision should skip MainModel and pick up the Qianfan
provider from _DISCOVERABLE_MODELS instead."""
fake_conf = self._fake_conf({
"model": "ernie-4.5-turbo-128k",
"qianfan_api_key": "test-qianfan-key",
})
from common import const
# Main bot reports supports_vision=False because the configured
# model is text-only.
fake_main_bot = MagicMock()
fake_main_bot.supports_vision = False
fake_main_bot.call_vision = MagicMock()
fake_model = MagicMock()
fake_model._resolve_bot_type.return_value = const.QIANFAN
fake_model.bot = fake_main_bot
# The discoverable Qianfan provider creates a new bot via factory.
fake_factory_bot = MagicMock()
fake_factory_bot.call_vision = MagicMock()
with patch("agent.tools.vision.vision.conf", return_value=fake_conf):
with patch("models.bot_factory.create_bot", return_value=fake_factory_bot):
from agent.tools.vision.vision import Vision
tool = Vision()
tool.model = fake_model
providers = tool._resolve_providers()
# MainModel must be absent; Qianfan fallback provider must be the
# first choice and pinned to the dedicated vision model.
names = [p.name for p in providers]
self.assertNotIn("MainModel", names)
self.assertEqual(names[0], "Qianfan")
self.assertEqual(providers[0].model_override, const.ERNIE_45_TURBO_VL)
def test_vision_prefers_same_vendor_fallback_over_other_configured_keys(self):
"""When the main bot is text-only ERNIE and several vision-capable
keys are configured, the same-vendor (Qianfan) fallback wins over
unrelated providers regardless of declaration order."""
fake_conf = self._fake_conf({
"model": "ernie-4.5-turbo-128k",
"qianfan_api_key": "test-qianfan-key",
"ark_api_key": "test-ark-key",
"claude_api_key": "test-claude-key",
"minimax_api_key": "test-minimax-key",
})
from common import const
fake_main_bot = MagicMock()
fake_main_bot.supports_vision = False
fake_main_bot.call_vision = MagicMock()
fake_model = MagicMock()
fake_model._resolve_bot_type.return_value = const.QIANFAN
fake_model.bot = fake_main_bot
fake_factory_bot = MagicMock()
fake_factory_bot.call_vision = MagicMock()
with patch("agent.tools.vision.vision.conf", return_value=fake_conf):
with patch("models.bot_factory.create_bot", return_value=fake_factory_bot):
from agent.tools.vision.vision import Vision
tool = Vision()
tool.model = fake_model
providers = tool._resolve_providers()
names = [p.name for p in providers]
self.assertEqual(names[0], "Qianfan")
self.assertEqual(providers[0].model_override, const.ERNIE_45_TURBO_VL)
# Other configured providers should still appear in the chain.
for expected in ("Doubao", "Claude", "MiniMax"):
self.assertIn(expected, names)
class TestQianfanDocs(unittest.TestCase):
def _read(self, relative_path):
root = os.path.join(os.path.dirname(__file__), "..")
with open(os.path.join(root, relative_path), encoding="utf-8") as f:
return f.read()
def test_qianfan_docs_exist_in_all_doc_locales(self):
for path in (
"docs/models/qianfan.mdx",
"docs/en/models/qianfan.mdx",
"docs/ja/models/qianfan.mdx",
):
text = self._read(path)
self.assertIn("qianfan_api_key", text)
self.assertIn("https://qianfan.baidubce.com/v2", text)
self.assertIn("ernie-4.5-turbo-128k", text)
self.assertIn("ernie-4.5-turbo-vl", text)
def test_model_indexes_link_qianfan(self):
for path in (
"docs/models/index.mdx",
"docs/en/models/index.mdx",
"docs/ja/models/index.mdx",
):
text = self._read(path)
self.assertIn('/models/qianfan', text)
def test_readme_documents_native_qianfan_provider(self):
text = self._read("README.md")
self.assertIn('"model": "ernie-5.0"', text)
self.assertIn('"qianfan_api_key": ""', text)
self.assertIn('"qianfan_api_base": "https://qianfan.baidubce.com/v2"', text)
def test_vision_docs_document_qianfan_provider(self):
expected = {
"docs/tools/vision.mdx": "百度千帆",
"docs/en/tools/vision.mdx": "Baidu Qianfan",
"docs/ja/tools/vision.mdx": "Baidu Qianfan",
}
for path, label in expected.items():
text = self._read(path)
self.assertIn(label, text)
self.assertIn("ernie-4.5-turbo-vl", text)
if __name__ == "__main__":
unittest.main()
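
The Youdao translator tests in the next file check two pieces of the v3 signing scheme: the input truncation rule and the SHA-256 digest over the concatenated fields. A minimal sketch of both, as the tests describe them, follows. The function names are hypothetical and are not the `YoudaoTranslator` methods themselves; whether `_build_sign` applies the truncation internally is an assumption here.

```python
import hashlib


def truncate_input(q):
    """Return q unchanged up to 20 characters, else first 10 + length + last 10."""
    if len(q) <= 20:
        return q
    return q[:10] + str(len(q)) + q[-10:]


def build_sign(app_key, app_secret, q, salt, curtime):
    """SHA-256 hex digest over appKey + truncated input + salt + curtime + appSecret."""
    raw = app_key + truncate_input(q) + salt + curtime + app_secret
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Truncation keeps the signed string short for long queries while the full text is still sent in the `q` field of the request.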

@@ -0,0 +1,260 @@
# encoding:utf-8
"""
Unit tests for the Youdao translator integration:
- YoudaoTranslator class behavior (signature, language code mapping,
request/response handling, error handling).
- translate.factory.create_translator dispatch and error message.
"""
import os
import sys
import unittest
from hashlib import sha256
from unittest.mock import MagicMock, patch
sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
def _mock_conf(**values):
"""Build a callable that mimics config.conf() returning the provided dict."""
cfg = MagicMock()
cfg.get = MagicMock(side_effect=lambda key, default=None: values.get(key, default))
return MagicMock(return_value=cfg)
class TestYoudaoTranslatorInit(unittest.TestCase):
def test_init_success(self):
with patch(
"translate.youdao.youdao_translate.conf",
_mock_conf(
youdao_translate_app_key="key123",
youdao_translate_app_secret="secret456",
),
):
from translate.youdao.youdao_translate import YoudaoTranslator
translator = YoudaoTranslator()
self.assertEqual(translator.app_key, "key123")
self.assertEqual(translator.app_secret, "secret456")
def test_init_missing_credentials_raises(self):
with patch(
"translate.youdao.youdao_translate.conf",
_mock_conf(youdao_translate_app_key="", youdao_translate_app_secret=""),
):
from translate.youdao.youdao_translate import YoudaoTranslator
with self.assertRaises(Exception) as ctx:
YoudaoTranslator()
self.assertIn("youdao", str(ctx.exception).lower())
class TestYoudaoTranslatorHelpers(unittest.TestCase):
def test_truncate_input_short(self):
from translate.youdao.youdao_translate import YoudaoTranslator
# length <= 20 -> returned as-is
self.assertEqual(YoudaoTranslator._truncate_input("hello"), "hello")
self.assertEqual(YoudaoTranslator._truncate_input("a" * 20), "a" * 20)
def test_truncate_input_long(self):
from translate.youdao.youdao_translate import YoudaoTranslator
# length > 20 -> first 10 + len + last 10
text = "abcdefghij" + "X" * 5 + "1234567890" # 25 chars
result = YoudaoTranslator._truncate_input(text)
self.assertEqual(result, "abcdefghij" + "25" + "1234567890")
def test_truncate_input_exactly_21(self):
from translate.youdao.youdao_translate import YoudaoTranslator
text = "a" * 21
result = YoudaoTranslator._truncate_input(text)
# first 10 'a' + "21" + last 10 'a'
self.assertEqual(result, "a" * 10 + "21" + "a" * 10)
def test_convert_lang_known_codes(self):
from translate.youdao.youdao_translate import YoudaoTranslator
self.assertEqual(YoudaoTranslator._convert_lang(""), "auto")
self.assertEqual(YoudaoTranslator._convert_lang("auto"), "auto")
self.assertEqual(YoudaoTranslator._convert_lang("zh"), "zh-CHS")
self.assertEqual(YoudaoTranslator._convert_lang("zh-CN"), "zh-CHS")
self.assertEqual(YoudaoTranslator._convert_lang("zh-TW"), "zh-CHT")
def test_convert_lang_passthrough(self):
from translate.youdao.youdao_translate import YoudaoTranslator
# unknown codes pass through unchanged (Youdao accepts ISO codes for many langs)
self.assertEqual(YoudaoTranslator._convert_lang("en"), "en")
self.assertEqual(YoudaoTranslator._convert_lang("ja"), "ja")
self.assertEqual(YoudaoTranslator._convert_lang("fr"), "fr")
def test_convert_lang_none(self):
from translate.youdao.youdao_translate import YoudaoTranslator
self.assertEqual(YoudaoTranslator._convert_lang(None), "auto")
def test_build_sign_matches_v3_spec(self):
with patch(
"translate.youdao.youdao_translate.conf",
_mock_conf(
youdao_translate_app_key="appKey",
youdao_translate_app_secret="appSecret",
),
):
from translate.youdao.youdao_translate import YoudaoTranslator
translator = YoudaoTranslator()
query = "hello"
salt = "saltvalue"
curtime = "1700000000"
expected = sha256(
("appKey" + "hello" + "saltvalue" + "1700000000" + "appSecret").encode("utf-8")
).hexdigest()
self.assertEqual(translator._build_sign(query, salt, curtime), expected)


class TestYoudaoTranslatorTranslate(unittest.TestCase):
    def _make_translator(self):
        with patch(
            "translate.youdao.youdao_translate.conf",
            _mock_conf(
                youdao_translate_app_key="appKey",
                youdao_translate_app_secret="appSecret",
            ),
        ):
            from translate.youdao.youdao_translate import YoudaoTranslator
            return YoudaoTranslator()

    def test_translate_success(self):
        translator = self._make_translator()
        mock_response = MagicMock()
        mock_response.json.return_value = {
            "errorCode": "0",
            "translation": ["你好"],
            "query": "hello",
            "l": "en2zh-CHS",
        }
        mock_response.raise_for_status = MagicMock()
        with patch(
            "translate.youdao.youdao_translate.requests.post",
            return_value=mock_response,
        ) as mock_post:
            result = translator.translate("hello", from_lang="en", to_lang="zh")
        self.assertEqual(result, "你好")
        mock_post.assert_called_once()
        # Check posted payload contains the right language codes
        call_kwargs = mock_post.call_args.kwargs
        payload = call_kwargs["data"]
        self.assertEqual(payload["q"], "hello")
        self.assertEqual(payload["from"], "en")
        self.assertEqual(payload["to"], "zh-CHS")
        self.assertEqual(payload["appKey"], "appKey")
        self.assertEqual(payload["signType"], "v3")
        self.assertIn("salt", payload)
        self.assertIn("sign", payload)
        self.assertIn("curtime", payload)

    def test_translate_multiline_joins_with_newlines(self):
        translator = self._make_translator()
        mock_response = MagicMock()
        mock_response.json.return_value = {
            "errorCode": "0",
            "translation": ["line one", "line two"],
        }
        mock_response.raise_for_status = MagicMock()
        with patch(
            "translate.youdao.youdao_translate.requests.post",
            return_value=mock_response,
        ):
            result = translator.translate("multi\nline")
        self.assertEqual(result, "line one\nline two")

    def test_translate_empty_query_returns_empty(self):
        translator = self._make_translator()
        # Should not even hit the network for an empty query
        with patch("translate.youdao.youdao_translate.requests.post") as mock_post:
            self.assertEqual(translator.translate(""), "")
            mock_post.assert_not_called()

    def test_translate_error_code_raises(self):
        translator = self._make_translator()
        mock_response = MagicMock()
        mock_response.json.return_value = {
            "errorCode": "108",
            "msg": "appKey无效",
        }
        mock_response.raise_for_status = MagicMock()
        with patch(
            "translate.youdao.youdao_translate.requests.post",
            return_value=mock_response,
        ):
            with self.assertRaises(Exception) as ctx:
                translator.translate("hello")
        msg = str(ctx.exception)
        self.assertIn("108", msg)

    def test_translate_empty_translation_raises(self):
        translator = self._make_translator()
        mock_response = MagicMock()
        mock_response.json.return_value = {"errorCode": "0", "translation": []}
        mock_response.raise_for_status = MagicMock()
        with patch(
            "translate.youdao.youdao_translate.requests.post",
            return_value=mock_response,
        ):
            with self.assertRaises(Exception):
                translator.translate("hello")

    def test_translate_default_target_language(self):
        translator = self._make_translator()
        mock_response = MagicMock()
        mock_response.json.return_value = {"errorCode": "0", "translation": ["hello"]}
        mock_response.raise_for_status = MagicMock()
        with patch(
            "translate.youdao.youdao_translate.requests.post",
            return_value=mock_response,
        ) as mock_post:
            translator.translate("你好")  # no from/to provided
        payload = mock_post.call_args.kwargs["data"]
        self.assertEqual(payload["from"], "auto")
        self.assertEqual(payload["to"], "en")


class TestTranslatorFactory(unittest.TestCase):
    def test_factory_creates_youdao(self):
        with patch(
            "translate.youdao.youdao_translate.conf",
            _mock_conf(
                youdao_translate_app_key="k",
                youdao_translate_app_secret="s",
            ),
        ):
            from translate.factory import create_translator
            from translate.youdao.youdao_translate import YoudaoTranslator
            translator = create_translator("youdao")
        self.assertIsInstance(translator, YoudaoTranslator)

    def test_factory_unknown_type_message(self):
        from translate.factory import create_translator
        with self.assertRaises(RuntimeError) as ctx:
            create_translator("nonexistent")
        msg = str(ctx.exception)
        self.assertIn("nonexistent", msg)
        self.assertIn("baidu", msg)
        self.assertIn("youdao", msg)


if __name__ == "__main__":
    unittest.main()


@@ -1,6 +1,17 @@
def create_translator(voice_type):
if voice_type == "baidu":
SUPPORTED_TRANSLATORS = ("baidu", "youdao")
def create_translator(translator_type):
if translator_type == "baidu":
from translate.baidu.baidu_translate import BaiduTranslator
return BaiduTranslator()
raise RuntimeError
if translator_type == "youdao":
from translate.youdao.youdao_translate import YoudaoTranslator
return YoudaoTranslator()
raise RuntimeError(
"unsupported translator type: {}, supported: {}".format(
translator_type, ", ".join(SUPPORTED_TRANSLATORS)
)
)
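
The new error message can be reproduced in isolation. The sketch below is a stand-alone illustration rather than the project's factory: `describe_unsupported` is a hypothetical helper name introduced here, and only the `SUPPORTED_TRANSLATORS` tuple and the message format are taken from the diff above.

```python
SUPPORTED_TRANSLATORS = ("baidu", "youdao")


def describe_unsupported(translator_type: str) -> str:
    """Format the message the same way the updated factory does (hypothetical helper)."""
    return "unsupported translator type: {}, supported: {}".format(
        translator_type, ", ".join(SUPPORTED_TRANSLATORS)
    )


print(describe_unsupported("nonexistent"))
# prints: unsupported translator type: nonexistent, supported: baidu, youdao
```

Naming both the rejected value and the accepted ones is what lets `test_factory_unknown_type_message` assert on all three substrings.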


@@ -0,0 +1,110 @@
# -*- coding: utf-8 -*-
"""
Youdao translator implementation.

Youdao Translation API v3 documentation:
    https://ai.youdao.com/DOCSIRMA/html/trans/api/wbfy/index.html

Configuration keys (in config.json):
    youdao_translate_app_key: Application key from Youdao AI platform.
    youdao_translate_app_secret: Application secret from Youdao AI platform.
"""
import time
import uuid
from hashlib import sha256

import requests

from config import conf
from translate.translator import Translator


class YoudaoTranslator(Translator):
    """Youdao translator using the v3 signature scheme."""

    API_URL = "https://openapi.youdao.com/api"

    # Mapping from ISO 639-1 codes (used by the Translator interface)
    # to Youdao-specific language codes.
    # Reference: https://ai.youdao.com/DOCSIRMA/html/trans/api/wbfy/index.html
    LANG_CODE_MAP = {
        "": "auto",
        "auto": "auto",
        "zh": "zh-CHS",
        "zh-CN": "zh-CHS",
        "zh-TW": "zh-CHT",
        "yue": "yue",  # Cantonese
    }

    def __init__(self) -> None:
        super().__init__()
        self.app_key = conf().get("youdao_translate_app_key")
        self.app_secret = conf().get("youdao_translate_app_secret")
        if not self.app_key or not self.app_secret:
            raise Exception("youdao translate app_key or app_secret not set")

    def translate(self, query: str, from_lang: str = "", to_lang: str = "en") -> str:
        if not query:
            return ""
        from_lang_code = self._convert_lang(from_lang) or "auto"
        to_lang_code = self._convert_lang(to_lang) or "en"
        salt = str(uuid.uuid4())
        curtime = str(int(time.time()))
        sign = self._build_sign(query, salt, curtime)
        payload = {
            "q": query,
            "from": from_lang_code,
            "to": to_lang_code,
            "appKey": self.app_key,
            "salt": salt,
            "sign": sign,
            "signType": "v3",
            "curtime": curtime,
        }
        headers = {"Content-Type": "application/x-www-form-urlencoded"}
        response = requests.post(self.API_URL, data=payload, headers=headers, timeout=10)
        response.raise_for_status()
        result = response.json()
        error_code = result.get("errorCode", "0")
        if error_code != "0":
            raise Exception(
                "youdao translate error: code={}, msg={}".format(
                    error_code, result.get("msg", "")
                )
            )
        translations = result.get("translation") or []
        if not translations:
            raise Exception("youdao translate returned empty translation")
        return "\n".join(translations)

    def _build_sign(self, query: str, salt: str, curtime: str) -> str:
        """
        Build the v3 signature.

        sign = sha256(appKey + input + salt + curtime + appSecret),
        where input = q if len(q) <= 20 else q[:10] + str(len(q)) + q[-10:].
        """
        input_str = self._truncate_input(query)
        sign_str = self.app_key + input_str + salt + curtime + self.app_secret
        return sha256(sign_str.encode("utf-8")).hexdigest()

    @staticmethod
    def _truncate_input(query: str) -> str:
        length = len(query)
        if length <= 20:
            return query
        return query[:10] + str(length) + query[-10:]

    @classmethod
    def _convert_lang(cls, lang: str) -> str:
        """Convert ISO 639-1 language code to Youdao-specific code."""
        if lang is None:
            return "auto"
        return cls.LANG_CODE_MAP.get(lang, lang)
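
The signature rule described in the `_build_sign` docstring is easy to check by hand. The sketch below restates it as two free functions, assuming only what the docstring says; `truncate_input` and `build_sign` are illustrative names rather than part of the module, and the key, salt and timestamp are placeholder values taken from the unit test.

```python
from hashlib import sha256


def truncate_input(query: str) -> str:
    """Return q unchanged up to 20 characters, else first 10 + length + last 10."""
    if len(query) <= 20:
        return query
    return query[:10] + str(len(query)) + query[-10:]


def build_sign(app_key: str, query: str, salt: str, curtime: str, app_secret: str) -> str:
    """sha256(appKey + input + salt + curtime + appSecret) as a hex digest."""
    raw = app_key + truncate_input(query) + salt + curtime + app_secret
    return sha256(raw.encode("utf-8")).hexdigest()


long_query = "a" * 10 + "b" * 5 + "c" * 10  # 25 characters in total
print(truncate_input("hello"))     # prints: hello
print(truncate_input(long_query))  # prints: aaaaaaaaaa25cccccccccc

# Placeholder credentials, matching the values used in the unit test.
sign = build_sign("appKey", "hello", "saltvalue", "1700000000", "appSecret")
print(len(sign))  # prints: 64
```

Note that the length is counted in characters, not bytes, so a query of CJK text is truncated by the same 20-character threshold before being UTF-8 encoded.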


@@ -73,9 +73,14 @@ def any_to_wav(any_path, wav_path):
return
if any_path.endswith(".sil") or any_path.endswith(".silk") or any_path.endswith(".slk"):
return sil_to_wav(any_path, wav_path)
audio = AudioSegment.from_file(any_path)
audio.set_frame_rate(8000) # Baidu speech transcription supports an 8000 Hz sample rate, pcm_s16le, single-channel speech recognition
audio.set_channels(1)
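# pydub 0.23.0+ appends `parameters` after the output file `-` in the ffmpeg command,
# so -nostdin may be treated as a "trailing option"; whether it takes effect depends on the ffmpeg version.
# The goal is to keep the ffmpeg subprocess from inheriting the parent process's stdin in a background service, which avoids deadlocks.
audio = AudioSegment.from_file(any_path, parameters=["-nostdin"])
# AudioSegment is immutable: set_frame_rate/set_channels return a new object and do not modify the original.
# The return value must be reassigned to audio, otherwise the change has no effect.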
audio = audio.set_frame_rate(16000)
audio = audio.set_channels(1)
audio.export(wav_path, format="wav", codec='pcm_s16le')
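
The reassignment fix above can be illustrated without pydub or ffmpeg installed. The sketch below uses `FakeSegment`, a hypothetical frozen dataclass that only mimics the "setters return a new object" behaviour; it is not pydub's `AudioSegment`, and the sample rates are arbitrary.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FakeSegment:
    """Hypothetical stand-in for an immutable audio segment."""
    frame_rate: int
    channels: int

    def set_frame_rate(self, rate: int) -> "FakeSegment":
        # Returns a modified copy; the receiver is left untouched.
        return replace(self, frame_rate=rate)

    def set_channels(self, channels: int) -> "FakeSegment":
        return replace(self, channels=channels)


audio = FakeSegment(frame_rate=44100, channels=2)

audio.set_frame_rate(16000)  # old pattern: the returned copy is discarded
print(audio.frame_rate)      # prints: 44100

audio = audio.set_frame_rate(16000)  # new pattern: keep the returned copy
audio = audio.set_channels(1)
print(audio.frame_rate, audio.channels)  # prints: 16000 1
```

With the old code the exported WAV kept the source file's sample rate and channel count, because neither setter call had any effect on `audio`.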


@@ -3,8 +3,6 @@ google voice service
"""
import json
import openai
from bridge.reply import Reply, ReplyType
from common.log import logger
from config import conf
@@ -15,7 +13,9 @@ import datetime, random
class OpenaiVoice(Voice):
def __init__(self):
openai.api_key = conf().get("open_ai_api_key")
# No-op: this implementation calls OpenAI HTTP endpoints directly via
# `requests`, so it does not need a global SDK to be configured.
pass
def voiceToText(self, voice_file):
logger.debug("[Openai] voice file name={}".format(voice_file))
@@ -35,10 +35,18 @@ class OpenaiVoice(Voice):
}
response = requests.post(url, headers=headers, files=files, data=data)
response_data = response.json()
text = response_data['text']
reply = Reply(ReplyType.TEXT, text)
logger.info("[Openai] voiceToText text={} voice file name={}".format(text, voice_file))
if response.status_code != 200 or "text" not in response_data:
logger.error(
f"[Openai] voiceToText failed: status={response.status_code}, "
f"resp={response_data}"
)
reply = Reply(ReplyType.ERROR, "我暂时还无法听清您的语音,请稍后再试吧~")
else:
text = response_data["text"]
reply = Reply(ReplyType.TEXT, text)
logger.info("[Openai] voiceToText text={} voice file name={}".format(text, voice_file))
except Exception as e:
logger.error(f"[Openai] voiceToText exception: {e}", exc_info=True)
reply = Reply(ReplyType.ERROR, "我暂时还无法听清您的语音,请稍后再试吧~")
finally:
return reply
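
The response check added to `voiceToText` can be read as a small pure function. The sketch below is a simplified restatement under that assumption: `extract_text` is a hypothetical helper, and it returns a `(ok, text)` tuple instead of the project's `Reply` objects so that it runs without the rest of the codebase.

```python
from typing import Any, Dict, Optional, Tuple


def extract_text(status_code: int, response_data: Dict[str, Any]) -> Tuple[bool, Optional[str]]:
    """Mirror the new guard: succeed only on HTTP 200 with a "text" field."""
    if status_code != 200 or "text" not in response_data:
        return False, None
    return True, response_data["text"]


print(extract_text(200, {"text": "hello"}))          # prints: (True, 'hello')
print(extract_text(401, {"error": "unauthorized"}))  # prints: (False, None)
print(extract_text(200, {"error": "bad audio"}))     # prints: (False, None)
```

Before this change an error payload without a `text` key raised `KeyError` on `response_data['text']`; the guard turns that case into an explicit error reply.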