feat(qianfan): scope vision support to multimodal models

zhayujie
2026-05-06 16:11:10 +08:00
parent 63f99af1e6
commit a5790d82f6
15 changed files with 212 additions and 50 deletions

@@ -23,7 +23,7 @@ If the current provider fails, the tool automatically tries the next one until i
| Vendor | Vision Model | Notes |
| --- | --- | --- |
| OpenAI / Compatible | Main model | All OpenAI-compatible multimodal models |
-| Baidu Qianfan | ernie-4.5-turbo-vl | Auto-discovered when `qianfan_api_key` is configured; can also be selected via `tool.vision.model` |
+| Baidu Qianfan | Main model | Multimodal main models (e.g. `ernie-5.0`) handle images directly; falls back to `ernie-4.5-turbo-vl` for text-only main models |
| Qwen (DashScope) | Main model | Via MultiModalConversation API |
| Claude | Main model | Anthropic native image format |
| Gemini | Main model | inlineData format |
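
The Qianfan row describes a selection rule: a multimodal main model handles images itself, while a text-only main model hands them to `ernie-4.5-turbo-vl`. A minimal sketch of that rule follows. The function name `pick_qianfan_vision_model` and the `MULTIMODAL_MODELS` set are hypothetical and not taken from this commit; only the two model names appear in the table.

```python
# Hypothetical allow-list: the commit's real list of multimodal Qianfan
# models is not shown in this hunk; "ernie-5.0" is the only example given.
MULTIMODAL_MODELS = {"ernie-5.0"}

# Fallback named in the table for text-only main models.
FALLBACK_VISION_MODEL = "ernie-4.5-turbo-vl"


def pick_qianfan_vision_model(main_model: str) -> str:
    """Return the model that should receive image inputs.

    Multimodal main models are used directly; any other main model
    is treated as text-only and replaced by the fallback vision model.
    """
    if main_model in MULTIMODAL_MODELS:
        return main_model
    return FALLBACK_VISION_MODEL


if __name__ == "__main__":
    for name in ("ernie-5.0", "some-text-only-model"):
        print(name, "->", pick_qianfan_vision_model(name))
```

An exact-match set is the simplest way to express the scoping; the actual change may detect multimodal models differently (for example by prefix or by a capability flag).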