Adopt the same channel-level pattern as weixin/wecom_bot/feishu so
the agent actually sees attachments the user sent:
- IMAGE: agent mode never reads memory.USER_IMAGE_CACHE, so a photo
sent before a question (e.g. "image" then 30s later "what's this?")
used to be lost. Now lone images go into channel.file_cache and
the next TEXT turn appends "[图片: <path>]" to the query before
producing the context. Cross-batch image+text combinations now
work as users expect.
- FILE: previously dropped at the sync_msg filter and unsupported
by WechatKfMessage. Add msgtype="file" parsing, download via the
WeCom media API, preserve the original filename from
Content-Disposition (RFC 5987 + plain forms), and route through
the same file_cache pipeline as images, surfacing as
"[文件: <path>]" in the next text turn.
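The filename recovery described above can be sketched roughly as follows. This is a minimal illustration, not the channel's actual code: `filename_from_content_disposition` and its `default` parameter are hypothetical names, and stripping directory components is an added safety assumption.

```python
import os
import re
from urllib.parse import unquote


def _clean(name: str, default: str) -> str:
    # Assumption: drop any directory components so a header cannot steer the save path.
    base = os.path.basename(name.strip().replace("\\", "/"))
    return base or default


def filename_from_content_disposition(header: str, default: str = "file") -> str:
    """Return the filename carried by a Content-Disposition header value."""
    # RFC 5987 extended form, e.g. filename*=UTF-8''%E6%8A%A5%E5%91%8A.pdf
    ext = re.search(r"filename\*\s*=\s*([^']+)'[^']*'([^;]+)", header, re.IGNORECASE)
    if ext:
        charset, encoded = ext.group(1).strip(), ext.group(2).strip()
        try:
            return _clean(unquote(encoded, encoding=charset, errors="strict"), default)
        except (LookupError, UnicodeDecodeError):
            pass  # unknown charset or undecodable bytes: fall back to the plain form
    # Plain form, quoted or bare, e.g. filename="report.pdf" or filename=report.pdf
    plain = re.search(r'filename\s*=\s*("([^"]*)"|[^;]+)', header, re.IGNORECASE)
    if plain:
        value = plain.group(2) if plain.group(2) is not None else plain.group(1)
        return _clean(value, default)
    return default
```

The extended form is tried first because it is the one that can carry non-ASCII names; the plain form is only a fallback.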
Move the sync_msg cursor file from the project-local tmp/ dir to ~/.wechat_kf_cursors.json so it survives tmp/ cleanups and cwd changes across restarts. Aligns with the weixin channel's credentials file convention.
- add wechat_kf_cursor_path config (default ~/.wechat_kf_cursors.json)
- expand ~ via os.path.expanduser in the channel init
- chmod the cursor file to 0o600 after each flush (no-op on Windows)
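A rough sketch of the cursor-file handling listed above. The helper names `resolve_cursor_path` and `flush_cursors` are hypothetical, and the write-to-temp-then-replace step is an assumption added for illustration, not something this change claims to do.

```python
import json
import os

DEFAULT_CURSOR_PATH = "~/.wechat_kf_cursors.json"


def resolve_cursor_path(configured: str = "") -> str:
    """Expand ~ in wechat_kf_cursor_path, falling back to the default."""
    return os.path.expanduser(configured or DEFAULT_CURSOR_PATH)


def flush_cursors(path: str, cursors: dict) -> None:
    """Persist the cursors and restrict the file to the owner."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(cursors, f)
    # Assumption: replace atomically so a crash never leaves a half-written file.
    os.replace(tmp_path, path)
    try:
        os.chmod(path, 0o600)  # owner read/write only; effectively a no-op on Windows
    except OSError:
        pass  # permission tightening is best-effort
```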
_dedup_image_text_pair previously fell back to returning only the last message whenever the batch was not exactly an image+text pair, which silently dropped multiple texts/images sent in quick succession.
sync_msg already guarantees cursor freshness, so no extra stale-history protection is needed. We now return all messages by default and collapse a batch only when it is exactly a two-message image+text pair sent within a 5s window (order-insensitive, normalized to [image, text]).
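The new batching rule might look roughly like this. It is a sketch under assumptions: messages are modeled as dicts with `msgtype` and `send_time` (seconds) keys, and "collapse" is read here as returning the pair in [image, text] order; the real function may differ on both points.

```python
from typing import Dict, List

PAIR_WINDOW_SECONDS = 5


def dedup_image_text_pair(batch: List[Dict]) -> List[Dict]:
    """Return every message, normalizing only an exact image+text pair."""
    if len(batch) == 2:
        types = {m["msgtype"] for m in batch}
        gap = abs(batch[0]["send_time"] - batch[1]["send_time"])
        if types == {"image", "text"} and gap <= PAIR_WINDOW_SECONDS:
            # Order-insensitive match, always emitted as [image, text].
            return sorted(batch, key=lambda m: m["msgtype"] != "image")
    # Any other shape is passed through untouched, so nothing is dropped.
    return list(batch)
```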
WeCom requires the callback HTTP response within ~5s; otherwise it retries the same notification. The previous code ran the sync_msg pull synchronously inside Query.POST, so a backlog could exceed the deadline and trigger retries that raced on the same cursor and ended up replying to the same user multiple times.
- Dispatch consume_callback to a background ThreadPoolExecutor and return 'success' immediately from the HTTP handler.
- Serialize work per open_kfid with a lock so retried/concurrent callbacks queue up instead of racing the cursor window.
- Shut down the executor on channel stop().
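A minimal sketch of the dispatch pattern in the list above. `CallbackDispatcher`, `handle_callback`, and the `consume` callable signature are hypothetical stand-ins for the channel's real handler and consume_callback.

```python
import threading
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


class CallbackDispatcher:
    def __init__(self, consume, max_workers: int = 4):
        self._consume = consume  # callable(open_kfid, token); stands in for consume_callback
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._locks = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()

    def _lock_for(self, open_kfid: str) -> threading.Lock:
        # Guard the dict so two threads never create separate locks for one open_kfid.
        with self._locks_guard:
            return self._locks[open_kfid]

    def _run(self, open_kfid: str, token) -> None:
        # Retried or concurrent callbacks for the same account queue up here.
        with self._lock_for(open_kfid):
            self._consume(open_kfid, token)

    def handle_callback(self, open_kfid: str, token) -> str:
        """Called from the HTTP handler: enqueue the work and answer at once."""
        self._executor.submit(self._run, open_kfid, token)
        return "success"

    def stop(self) -> None:
        self._executor.shutdown(wait=True)
```

The lock is keyed per open_kfid rather than global so that a backlog on one account does not delay the others.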
Rename the WeCom customer-service channel and give it its own corp_id
field so users no longer have to share `wechatcom_corp_id` with the
self-built WeCom app channel.
Renames (channel-side):
- channel type / const: wechatcom_kf -> wechat_kf
- package dir: channel/wechatcom_kf/ -> channel/wechat_kf/
- python files / classes: WechatComKf* -> WechatKf*
- config keys: wechatcom_kf_{secret,token,aes_key,port} ->
wechat_kf_{secret,token,aes_key,port}; new wechat_kf_corp_id
- env vars: WECHATCOM_KF_* -> WECHAT_KF_*; new WECHAT_KF_CORP_ID
- log prefix / cursor file: [wechatcom_kf] -> [wechat_kf]
- web console CHANNEL_DEFS key + startup log line
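For anyone carrying an old config forward, the key renames above could be applied with something like the following. This helper is purely illustrative and not part of the change; `migrate_kf_config` is a hypothetical name, the `channel_type` key is an assumption, and seeding `wechat_kf_corp_id` from `wechatcom_corp_id` is an assumption about what most users would want.

```python
OLD_TO_NEW = {
    f"wechatcom_kf_{suffix}": f"wechat_kf_{suffix}"
    for suffix in ("secret", "token", "aes_key", "port")
}


def migrate_kf_config(conf: dict) -> dict:
    """Return a copy of conf with the wechatcom_kf_* keys renamed to wechat_kf_*."""
    out = dict(conf)
    for old, new in OLD_TO_NEW.items():
        if old in out:
            value = out.pop(old)
            out.setdefault(new, value)  # an explicit new-style value wins
    # Assumption: reuse the previously shared corp id unless a dedicated one is set.
    if "wechat_kf_corp_id" not in out and "wechatcom_corp_id" in out:
        out["wechat_kf_corp_id"] = out["wechatcom_corp_id"]
    # Assumption: the channel is selected through a "channel_type" key.
    if out.get("channel_type") == "wechatcom_kf":
        out["channel_type"] = "wechat_kf"
    return out
```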
Renames (docs):
- docs/channels/wecom-kf.mdx -> docs/channels/wechat-kf.mdx (zh/en/ja)
- update docs.json sidebar entries and all field names inside the docs
In addition, the Web Console "微信客服" (WeCom Customer Service) entry now
exposes its own Corp ID field instead of reusing the wechatcom_app one,
and the channel guide now includes a screenshot of the visual config.
A Web Console onboarding section is added (Tabs: Web Console / config
file), and the local URL `http://127.0.0.1:9899/` parenthetical is
dropped for consistency with other channel docs.
Co-authored-by: Cursor <cursoragent@cursor.com>
Add a self-deployment guide for the new `wechatcom_kf` channel under
`docs/channels/wecom-kf.mdx` in zh / en / ja, mirroring the existing
`wecom.mdx` structure. Wire each language version into the sidebar in
`docs/docs.json`.
Walks through: creating the WeCom custom app, retrieving Corp ID /
Secret (push-to-phone) / Token / EncodingAESKey, configuring `config.json`,
saving the callback URL + Enterprise Trusted IPs, binding the WeCom
Customer Service account, and distributing the access link / QR code.
Co-authored-by: Cursor <cursoragent@cursor.com>
- Clarify Secret retrieval (the admin must tap "查看" (View) on their
phone; the Secret cannot be copied from the console)
- Update WeCom customer-service binding section to point to the
"接入链接" (Access link) UI (copy link / generate QR code)
- Drop developer-only asides (wechatcomapp_secret / port collision
notes, internal sections about cursor persistence, channel runtime
differences, multi-kf-account support)
- Stop exposing `wechatcom_kf_cursor_dir` as a user config; cursor file
is now fixed under `tmp/`, which is an internal implementation detail.
Co-authored-by: Cursor <cursoragent@cursor.com>
- Switch from the local `WechatComAppClient` (whose `fetch_access_token`
may return the raw response dict and whose background refresh loop
re-fetches every 60s) to the stock `wechatpy.enterprise.WeChatClient`.
- Use `client.access_token` (string property) when building sync_msg /
send_msg URLs; the previous `client.fetch_access_token()` call could
interpolate a dict into the URL and yield errcode 40014.
- Always skip historical messages on first start; drop the
`wechatcom_kf_skip_history_on_first_start` config — there is no real
case for replaying up to 14 days of history.
- Change default callback port from 9899 to 9888.
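The token bug above comes down to interpolating a non-string into the URL. A small guard like the one below makes that failure loud instead of silent. `build_kf_url` is a hypothetical helper for illustration; the base URL is the WeCom customer-service API prefix.

```python
from urllib.parse import urlencode

KF_API_BASE = "https://qyapi.weixin.qq.com/cgi-bin/kf"


def build_kf_url(endpoint: str, access_token) -> str:
    """Build a sync_msg / send_msg URL, refusing anything but a token string."""
    if not isinstance(access_token, str):
        # A raw response dict here is what used to end up in the URL.
        raise TypeError(f"access_token must be str, got {type(access_token).__name__}")
    if not access_token:
        raise ValueError("access_token is empty")
    return f"{KF_API_BASE}/{endpoint}?{urlencode({'access_token': access_token})}"
```

With the stock client, the caller would pass `client.access_token` (the string property) rather than the return value of `client.fetch_access_token()`.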
Co-authored-by: Cursor <cursoragent@cursor.com>
Introduce a new channel that integrates with WeCom Customer Service
(微信客服), separate from the existing self-built WeCom app channel.
- Register channel type `wechatcom_kf` in factory, app loader and const
- Add config keys for token / secret / aes_key / port / cursor dir and
the first-start history-skip switch; also expose corresponding env vars
- Implement channel, message and cursor store under channel/wechatcom_kf/
Co-authored-by: Cursor <cursoragent@cursor.com>
Add embedding_provider config knob with native support for
openai / dashscope / doubao / zhipu / linkai, plus an in-chat
/memory status and /memory rebuild-index workflow for switching
vendors safely.
When reloading a conversation, failed tool calls incorrectly showed a checkmark instead of an X because the is_error field was lost in the history rendering pipeline. Propagate is_error from DB extraction through to the frontend rendering to match the live SSE behavior.
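The shape of the fix can be illustrated as below. The row layout (`type`, `tool_use_id`, `content`, `is_error`) and both function names are assumptions made for the sketch; the actual history schema and the frontend renderer are not shown in this change description.

```python
def extract_tool_results(db_rows):
    """Build history entries from stored rows, keeping the is_error flag."""
    results = []
    for row in db_rows:
        if row.get("type") != "tool_result":
            continue
        results.append({
            "tool_use_id": row.get("tool_use_id"),
            "content": row.get("content", ""),
            # Previously dropped at this step; a missing flag means success.
            "is_error": bool(row.get("is_error", False)),
        })
    return results


def status_icon(result: dict) -> str:
    """Mirror the live SSE rendering: X for failures, checkmark otherwise."""
    return "✗" if result["is_error"] else "✓"
```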
Browser sessions now reuse a Chromium user profile across runs by default
(`~/.cow/browser_profile`), so users only log in to a site once.
Three launch modes are selectable via `tools.browser` in config.json:
- persistent (default): Playwright Chromium with a persistent user_data_dir
- cdp: attach to an externally launched real Chrome via `cdp_endpoint`
(full fingerprints, ideal for sites with strict bot detection)
- fresh: clean context every run, set `persistent: false`
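How the three modes might be selected from the `tools.browser` section, as a sketch. `resolve_browser_mode` is a hypothetical helper, and letting `cdp_endpoint` take precedence over `persistent` is an assumption; the description above does not state the precedence.

```python
import os

DEFAULT_PROFILE_DIR = "~/.cow/browser_profile"


def resolve_browser_mode(browser_conf: dict) -> dict:
    """Map the tools.browser config section to one of the three launch modes."""
    endpoint = browser_conf.get("cdp_endpoint")
    if endpoint:
        # Attach to an externally launched Chrome; nothing is launched locally.
        return {"mode": "cdp", "cdp_endpoint": endpoint}
    if browser_conf.get("persistent", True) is False:
        # Clean context on every run.
        return {"mode": "fresh"}
    # Default: reuse the Chromium profile across runs.
    return {"mode": "persistent", "user_data_dir": os.path.expanduser(DEFAULT_PROFILE_DIR)}
```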
Also:
- Self-heal when the user closes the browser window mid-session: detect
closed page/context/browser via close listeners and exception scanning,
then transparently relaunch on the next request.
- Graceful CDP shutdown: disconnect only, never kill the user's Chrome.
- Friendly errors when the CDP endpoint is unreachable or the persistent
profile is locked, so the LLM can guide the user instead of looping.
- Fix tool config being silently overwritten by workspace config in
AgentInitializer; per-tool user settings (e.g. browser.cdp_endpoint)
are now merged instead of replaced.
- Update zh / en / ja docs with the new login-persistence section,
including the Chrome 137+ requirement to pair --remote-debugging-port
with a dedicated --user-data-dir.
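The merge-instead-of-replace fix for tool config is essentially a recursive dict merge. A generic sketch follows; `deep_merge` is a hypothetical name, and which side acts as `base` versus `override` in AgentInitializer is not stated above, so the argument order in the usage below is an assumption.

```python
import copy


def deep_merge(base: dict, override: dict) -> dict:
    """Merge override into a copy of base, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

For example, merging a workspace section that only sets a (hypothetical) `headless` flag into a user section carrying `browser.cdp_endpoint` keeps both keys, where a plain assignment would have lost the endpoint.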
Boot MCP servers (npx/uvx) on a background thread instead of blocking
agent init. Built-in tools serve traffic immediately while MCP comes
online; each new agent reads whatever is ready at creation time.
Warmup is idempotent via the _mcp_loaded flag, so concurrent sessions
never re-fork subprocesses. Per-server failures are isolated, and warmup
is triggered in app.py so loading overlaps with channel startup.
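The warmup behavior can be sketched as follows. `McpWarmup`, `ready_tools`, and `wait` are hypothetical names, and each server is modeled as a plain callable that returns its tools; only the `_mcp_loaded` flag comes from the description above.

```python
import threading


class McpWarmup:
    def __init__(self, server_starters: dict):
        self._starters = dict(server_starters)  # name -> callable returning a list of tools
        self._mcp_loaded = False
        self._lock = threading.Lock()
        self._ready = {}
        self._thread = None

    def warmup(self) -> None:
        """Start background loading once; later calls return immediately."""
        with self._lock:
            if self._mcp_loaded:
                return
            self._mcp_loaded = True
            self._thread = threading.Thread(target=self._load_all, daemon=True)
            self._thread.start()

    def _load_all(self) -> None:
        for name, start in self._starters.items():
            try:
                tools = start()
            except Exception:
                continue  # one broken server must not block the others
            with self._lock:
                self._ready[name] = tools

    def ready_tools(self) -> dict:
        """Snapshot of whatever has finished loading so far."""
        with self._lock:
            return dict(self._ready)

    def wait(self) -> None:
        if self._thread is not None:
            self._thread.join()
```

A new agent would call `ready_tools()` at creation time and simply get fewer MCP tools if warmup has not finished yet.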