Further refinements on top of #2795:
- persist real session_id (notify_session_id) at task creation so group chats
correctly map back to the user's actual conversation
- mark scheduler turns with [SCHEDULED] (recognise legacy "Scheduled task"
prefix too for backward-compatible pruning)
- prune both DB and in-memory to scheduler_inject_max_per_session (default 3),
only marker-tagged pairs are touched; regular user turns never deleted
- send_message type gated by scheduler_inject_send_message (default false) —
fixed reminder text rarely benefits follow-up Q&A
Co-authored-by: huangrichao2020 <grdomai43881@gmail.com>
Address review feedback from #2794:
1. Use notify_session_id instead of receiver for correct group chat mapping
- Task creation should store the real session_id in action.notify_session_id
- Falls back to receiver for backward compatibility with old tasks
2. Add injection to all four execution branches:
- _execute_agent_task
- _execute_send_message
- _execute_tool_call
- _execute_skill_call (also fixed missing channel.send)
3. Add config switch and content truncation:
- scheduler_inject_to_session (default: true) to toggle the feature
- 2000 char limit prevents high-frequency tasks from bloating sessions
Fixes#2793
- Add call_vision method to all bot implementations (DashScope, Claude,
Gemini, ZhipuAI, MiniMax, Doubao, Moonshot, OpenAICompatibleBot)
using each vendor's native multimodal API format
- Remove call_with_tools/call_vision from Bot base class to fix MRO
shadowing issue with OpenAICompatibleBot mixin
- Refactor vision tool provider resolution: MainModel → other configured
models (auto-discovered) → OpenAI → LinkAI, with automatic fallback
- Return actual model name used in call_vision responses
- Sync config.json API keys to .env bidirectionally on startup
- Fix bot instance cache to detect bot_type/use_linkai config changes
- Add SSE reconnection support for web console
- Preserve image path hints in Gemini text for correct vision tool calls
- Update docs/tools/vision.mdx
Browser tool enhancements:
- Navigate action now auto-includes snapshot result, saving one LLM round-trip
- Wait for networkidle + 800ms after navigation for SPA/JS-rendered pages
- Prompt guides agent to screenshot key results and ask user for login/CAPTCHA help
- Fixed playwright version pinned to 1.52.0; mirror fallback to official CDN on failure
Web console file/image support:
- SSE real-time push for images and files via on_event (file_to_send)
- Added /api/file endpoint to serve local files for web preview
- Frontend renders images in media-content container (survives delta/done overwrites)
- File attachment cards with download links; RFC 5987 encoding for non-ASCII filenames
Tool workspace fix:
- Inject workspace_dir as cwd into send and browser tools (previously only file tools)
- Screenshots now save to ~/cow/tmp/ instead of project directory
- Simplify skill-creator installation flow
- Refine skill selection prompt for better matching
- Add parameter alias and env variable hints for tools
- Skip linkai-agent when unconfigured
- Create skills/ dir in workspace on init