diff --git a/README.md b/README.md
index e6b479ba..110f1a1a 100644
--- a/README.md
+++ b/README.md
@@ -604,11 +604,12 @@ API Key 创建:在 [控制台](https://aistudio.google.com/app/apikey?hl=zh-cn
 ```json
 {
   "model": "ernie-5.0",
-  "qianfan_api_key": ""
+  "qianfan_api_key": "",
+  "qianfan_api_base": "https://qianfan.baidubce.com/v2"
 }
 ```
 
-  - `model`: 默认推荐填写 `ernie-5.0`,也可填写 `ernie-4.5-turbo-128k`、`ernie-4.5-turbo-32k`、`ernie-x1-turbo-32k`
+  - `model`: 默认推荐填写 `ernie-5.0`,也可填写 `ernie-4.5-turbo-128k`、`ernie-4.5-turbo-32k`、`ernie-x1-turbo-32k`;Vision 工具可使用 `ernie-4.5-turbo-vl-preview`
   - `qianfan_api_key`: 百度千帆 API Key,通常以 `bce-v3/` 开头,可在百度智能云控制台创建
   - `qianfan_api_base`: 可选,默认为 `https://qianfan.baidubce.com/v2`
 
diff --git a/docs/en/models/qianfan.mdx b/docs/en/models/qianfan.mdx
index 10a4e862..1e87a26a 100644
--- a/docs/en/models/qianfan.mdx
+++ b/docs/en/models/qianfan.mdx
@@ -28,6 +28,20 @@ Option 1: Native integration (recommended):
 | `ernie-4.5-turbo-32k` | General chat with a balanced context window and cost |
 | `ernie-x1-turbo-32k` | Tasks that need stronger reasoning |
 
+## Vision tool
+
+After `qianfan_api_key` is configured, Agent mode can auto-discover Qianfan for the Vision tool. The recommended Qianfan vision model is `ernie-4.5-turbo-vl-preview`:
+
+```json
+{
+  "tool": {
+    "vision": {
+      "model": "ernie-4.5-turbo-vl-preview"
+    }
+  }
+}
+```
+
 Option 2: OpenAI-compatible configuration:
 
 ```json
diff --git a/docs/en/tools/vision.mdx b/docs/en/tools/vision.mdx
index 01e36db2..942e1d7e 100644
--- a/docs/en/tools/vision.mdx
+++ b/docs/en/tools/vision.mdx
@@ -23,6 +23,7 @@ If the current provider fails, the tool automatically tries the next one until i
 | Vendor | Vision Model | Notes |
 | --- | --- | --- |
 | OpenAI / Compatible | Main model | All OpenAI-compatible multimodal models |
+| Baidu Qianfan | ernie-4.5-turbo-vl-preview | Auto-discovered when `qianfan_api_key` is configured; can also be selected via `tool.vision.model` |
 | Qwen (DashScope) | Main model | Via MultiModalConversation API |
 | Claude | Main model | Anthropic native image format |
 | Gemini | Main model | inlineData format |
@@ -52,7 +53,7 @@ To specify a particular model for the vision tool, add to `config.json`:
 {
   "tool": {
     "vision": {
-      "model": "gpt-4o"
+      "model": "ernie-4.5-turbo-vl-preview"
     }
   }
 }
diff --git a/docs/ja/models/qianfan.mdx b/docs/ja/models/qianfan.mdx
index 5fe11622..cd69d0f7 100644
--- a/docs/ja/models/qianfan.mdx
+++ b/docs/ja/models/qianfan.mdx
@@ -28,6 +28,20 @@ description: Baidu Qianfan ERNIE モデル設定
 | `ernie-4.5-turbo-32k` | コンテキスト長とコストのバランスが良い一般チャット向け |
 | `ernie-x1-turbo-32k` | より強い推論が必要なタスク向け |
 
+## Vision ツール
+
+`qianfan_api_key` を設定すると、Agent モードの Vision ツールは Qianfan を自動検出できます。推奨する Qianfan の視覚モデルは `ernie-4.5-turbo-vl-preview` です:
+
+```json
+{
+  "tool": {
+    "vision": {
+      "model": "ernie-4.5-turbo-vl-preview"
+    }
+  }
+}
+```
+
 方法 2: OpenAI 互換接続:
 
 ```json
diff --git a/docs/ja/tools/vision.mdx b/docs/ja/tools/vision.mdx
index 95e28a22..037cc582 100644
--- a/docs/ja/tools/vision.mdx
+++ b/docs/ja/tools/vision.mdx
@@ -23,6 +23,7 @@ Vision ツールは多段階の自動選択+自動フォールバック戦略
 | ベンダー | ビジョンモデル | 説明 |
 | --- | --- | --- |
 | OpenAI / 互換プロトコル | メインモデル | すべての OpenAI 互換マルチモーダルモデルに対応 |
+| Baidu Qianfan | ernie-4.5-turbo-vl-preview | `qianfan_api_key` を設定すると自動検出され、`tool.vision.model` でも指定できます |
 | 通義千問 (DashScope) | メインモデル | MultiModalConversation API 経由 |
 | Claude | メインモデル | Anthropic ネイティブ画像形式 |
 | Gemini | メインモデル | inlineData 形式 |
@@ -52,7 +53,7 @@ Vision ツールで使用するモデルを指定するには、`config.json`
 {
   "tool": {
     "vision": {
-      "model": "gpt-4o"
+      "model": "ernie-4.5-turbo-vl-preview"
     }
   }
 }
diff --git a/docs/models/qianfan.mdx b/docs/models/qianfan.mdx
index 4d71593f..c3ac6132 100644
--- a/docs/models/qianfan.mdx
+++ b/docs/models/qianfan.mdx
@@ -28,6 +28,20 @@ description: 百度千帆 ERNIE 模型配置
 | `ernie-4.5-turbo-32k` | 通用对话,成本和上下文更均衡 |
 | `ernie-x1-turbo-32k` | 需要更强推理能力的任务 |
 
+## Vision 工具
+
+配置 `qianfan_api_key` 后,Agent 的 Vision 工具可以自动使用千帆视觉模型。默认推荐使用 `ernie-4.5-turbo-vl-preview`:
+
+```json
+{
+  "tool": {
+    "vision": {
+      "model": "ernie-4.5-turbo-vl-preview"
+    }
+  }
+}
+```
+
 方式二:OpenAI 兼容方式接入:
 
 ```json
diff --git a/docs/tools/vision.mdx b/docs/tools/vision.mdx
index 5ef55674..398fc579 100644
--- a/docs/tools/vision.mdx
+++ b/docs/tools/vision.mdx
@@ -19,6 +19,7 @@ Vision 工具采用多级自动选择 + 自动兜底策略,无需手动配置
 | 厂商 | 视觉模型 | 说明 |
 | --- | --- | --- |
 | OpenAI / 兼容协议 | 使用主模型 | 支持所有 OpenAI 协议兼容的多模态模型 |
+| 百度千帆 (Qianfan) | ernie-4.5-turbo-vl-preview | 配置 `qianfan_api_key` 后自动发现,也可通过 `tool.vision.model` 指定 |
 | 通义千问 (DashScope) | 使用主模型 | 例如 qwen3.6-plus 等 |
 | Claude | 使用主模型 | Anthropic 原生图像格式 |
 | Gemini | 使用主模型 | inlineData 格式 |
@@ -41,7 +42,7 @@ Vision 工具采用多级自动选择 + 自动兜底策略,无需手动配置
 {
   "tool": {
     "vision": {
-      "model": "gpt-4o"
+      "model": "ernie-4.5-turbo-vl-preview"
     }
   }
 }
diff --git a/tests/test_qianfan_provider.py b/tests/test_qianfan_provider.py
index 2e51224a..8b996d11 100644
--- a/tests/test_qianfan_provider.py
+++ b/tests/test_qianfan_provider.py
@@ -452,6 +452,7 @@ class TestQianfanDocs(unittest.TestCase):
         self.assertIn("qianfan_api_key", text)
         self.assertIn("https://qianfan.baidubce.com/v2", text)
         self.assertIn("ernie-4.5-turbo-128k", text)
+        self.assertIn("ernie-4.5-turbo-vl-preview", text)
 
     def test_model_indexes_link_qianfan(self):
         for path in (
@@ -469,6 +470,17 @@ class TestQianfanDocs(unittest.TestCase):
         self.assertIn('"qianfan_api_key": ""', text)
         self.assertIn('"qianfan_api_base": "https://qianfan.baidubce.com/v2"', text)
 
+    def test_vision_docs_document_qianfan_provider(self):
+        expected = {
+            "docs/tools/vision.mdx": "百度千帆",
+            "docs/en/tools/vision.mdx": "Baidu Qianfan",
+            "docs/ja/tools/vision.mdx": "Baidu Qianfan",
+        }
+        for path, label in expected.items():
+            text = self._read(path)
+            self.assertIn(label, text)
+            self.assertIn("ernie-4.5-turbo-vl-preview", text)
+
 
 if __name__ == "__main__":
     unittest.main()