The search tool now supports four backends (bocha, qianfan, zhipu,
linkai) with unified output, plus a routing layer:
- strategy 'auto' (default): pick the first configured provider in
canonical order: bocha > qianfan > zhipu > linkai
- strategy 'fixed': pin a specific provider
- the agent may pass `provider` to override per call (only exposed when
≥2 providers are configured and the strategy is 'auto')
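
The routing rules above can be sketched roughly as follows. This is an
illustration only, not the tool's actual code: the function name
resolve_provider and its parameters are hypothetical, and falling back
to the auto pick when an override is not usable is an assumption.

```python
from typing import Optional, Sequence

# Canonical priority order, as listed in the notes above.
CANONICAL_ORDER = ("bocha", "qianfan", "zhipu", "linkai")


def resolve_provider(
    configured: Sequence[str],
    strategy: str = "auto",
    fixed_provider: Optional[str] = None,
    override: Optional[str] = None,
) -> str:
    """Pick a search backend (hypothetical helper, for illustration)."""
    # Keep only configured providers, sorted by canonical priority.
    available = [p for p in CANONICAL_ORDER if p in configured]
    if not available:
        raise ValueError("no search provider configured")

    if strategy == "fixed":
        if fixed_provider not in available:
            raise ValueError(f"fixed provider {fixed_provider!r} is not configured")
        return fixed_provider

    if strategy != "auto":
        raise ValueError(f"unknown strategy {strategy!r}")

    # The per-call override only applies under 'auto' with >= 2 providers.
    # Assumption: an unusable override silently falls back to the auto pick.
    if override is not None and len(available) >= 2 and override in available:
        return override
    return available[0]
```

For example, with only zhipu and qianfan configured, the auto strategy
resolves to qianfan unless the agent passes provider="zhipu".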
- Clear NOT_SUPPORT_REPLYTYPE on weixin, wecom_bot, dingtalk so TTS replies
are actually synthesized for these channels.
- Wire desire_rtype=VOICE in weixin and wecom_bot _compose_context so the
always_reply_voice / voice_reply_voice toggles take effect.
- DingTalk: send native sampleAudio (mediaId + duration). The media API
only accepts ogg/amr, so convert TTS mp3/wav to amr on the fly.
- WeCom Bot: send native voice msgtype via ws (respond + active push),
converting TTS audio to amr before upload.
- Weixin (ilink): there is no outbound voice item, so deliver TTS as a
file attachment.
- chat_channel: when a TEXT reply is converted to VOICE, stash original
text in context["voice_reply_text"] and send a text bubble before the
voice reply. Skipped for feishu_streamed and wechatcom_app, which
already render text alongside the voice.
When reloading a conversation, failed tool calls incorrectly showed a
checkmark instead of an X because the is_error field was lost in the
history rendering pipeline. Propagate is_error from DB extraction
through to frontend rendering so reloaded history matches the live SSE
behavior.
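
As a rough sketch of the idea, the flag has to survive every hop from
stored row to rendered entry. The row keys (tool_name, output) and the
helper names below are hypothetical; only is_error comes from the
description above.

```python
from typing import Any, Dict, List


def extract_tool_results(rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build history entries from stored rows (hypothetical shape)."""
    history = []
    for row in rows:
        history.append({
            "tool": row["tool_name"],
            "output": row.get("output", ""),
            # Carry the flag through instead of dropping it; rows without
            # the field are treated as successful calls.
            "is_error": bool(row.get("is_error", False)),
        })
    return history


def status_icon(entry: Dict[str, Any]) -> str:
    """Choose the marker the UI would show for one tool call."""
    return "✗" if entry.get("is_error") else "✓"
```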
Boot MCP servers (npx/uvx) on a background thread instead of blocking
agent init. Built-in tools serve traffic immediately while MCP comes
online; each new agent reads whatever is ready at creation time.
Idempotent via the _mcp_loaded flag, so concurrent sessions never
re-fork subprocesses. Per-server failures are isolated, and warmup is
triggered in app.py so loading overlaps with channel startup.
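
A minimal sketch of this pattern, assuming each MCP server can be
started by a plain callable. The class McpWarmup and its methods are
hypothetical; only the _mcp_loaded flag name is taken from the text
above.

```python
import threading
from typing import Callable, Dict, Optional


class McpWarmup:
    """Start MCP servers once, in the background (illustrative only)."""

    def __init__(self, starters: Dict[str, Callable[[], object]]):
        self._starters = starters
        self._lock = threading.Lock()
        self._mcp_loaded = False
        self._thread: Optional[threading.Thread] = None
        self.ready: Dict[str, object] = {}
        self.failed: Dict[str, Exception] = {}

    def warmup(self) -> threading.Thread:
        # The flag is checked and set under the lock, so concurrent
        # callers share one loader thread instead of spawning servers twice.
        with self._lock:
            if not self._mcp_loaded:
                self._mcp_loaded = True
                self._thread = threading.Thread(target=self._load_all, daemon=True)
                self._thread.start()
            return self._thread

    def _load_all(self) -> None:
        for name, start in self._starters.items():
            try:
                tool = start()
            except Exception as exc:
                # One failing server must not block the others.
                with self._lock:
                    self.failed[name] = exc
                continue
            with self._lock:
                self.ready[name] = tool

    def snapshot(self) -> Dict[str, object]:
        # A new agent takes whatever is ready at creation time.
        with self._lock:
            return dict(self.ready)
```

Calling warmup() early in startup and snapshot() when each agent is
created gives the overlap described above: built-in tools work right
away, and MCP tools appear for agents created after loading completes.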
- rewrite streaming reply to official cardkit v2.0 API (default on, auto-fallback)
- fix Whisper hallucination: bump ASR sample rate to 16k, pass language=zh
- fix lock-over-IO and tmp file cleanup from #2791
- drop deprecated feishu_bot_name; quiet unknown-key warnings
- docs: cardkit permission and feishu_stream_reply usage
- Receive audio messages: map msg_type=audio to ContextType.VOICE and
download opus file via lazy _prepare_fn for STT pipeline
- Send voice replies: upload opus audio via Feishu file API, auto-convert
non-opus formats (e.g. mp3) using pydub before upload
- Streaming text reply: inject on_event callback into context; send a
card placeholder on first delta, then PATCH-update it in-place at a
configurable interval (feishu_stream_interval_ms) to achieve typewriter
effect; set feishu_streamed=True to suppress duplicate send()
- Enable NOT_SUPPORT_REPLYTYPE=[] to unblock voice and image reply types
- Fix AudioSegment mutation bug in audio_convert.py: set_frame_rate /
set_channels return new objects and must be reassigned
- Add -nostdin to ffmpeg invocation to prevent stdin deadlock in daemon
- Add feishu_bot_name, feishu_stream_reply, feishu_stream_interval_ms
config keys to config-template.json
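
The streaming text reply described above can be sketched as a small
throttle around two callbacks. This is an assumption-laden
illustration: StreamingCard, send_placeholder, patch_card, and finish
are hypothetical names, and the real channel code may structure this
differently. The interval_ms parameter plays the role of
feishu_stream_interval_ms.

```python
import time
from typing import Callable, Optional


class StreamingCard:
    """Throttle card updates while text deltas arrive (illustrative)."""

    def __init__(
        self,
        send_placeholder: Callable[[str], str],   # returns a card id
        patch_card: Callable[[str, str], None],   # (card id, full text)
        interval_ms: int = 300,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._send_placeholder = send_placeholder
        self._patch_card = patch_card
        self._interval_ms = interval_ms
        self._clock = clock
        self._card_id: Optional[str] = None
        self._text = ""
        self._last_patch = 0.0

    def on_delta(self, delta: str) -> None:
        self._text += delta
        now = self._clock()
        if self._card_id is None:
            # First delta: create the card that later updates will edit.
            self._card_id = self._send_placeholder(self._text)
            self._last_patch = now
            return
        # Later deltas: update in place, at most once per interval.
        if (now - self._last_patch) * 1000 >= self._interval_ms:
            self._patch_card(self._card_id, self._text)
            self._last_patch = now

    def finish(self) -> None:
        # Flush the final text so the card never ends on a stale update.
        if self._card_id is not None:
            self._patch_card(self._card_id, self._text)
```

Accumulating the full text and patching the whole card keeps each
update idempotent, so a dropped or delayed update is corrected by the
next one.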