mirror of
https://github.com/zhayujie/chatgpt-on-wechat.git
synced 2026-06-02 09:48:22 +08:00
docs: make English the default docs language and fix link paths
@@ -1,42 +1,42 @@
 ---
-title: browser - 浏览器
-description: 控制浏览器访问和操作网页
+title: browser - Browser
+description: Control a browser to access and interact with web pages
 ---
 
-控制 Chromium 浏览器进行网页导航、元素交互和内容提取。支持 JavaScript 渲染的动态页面,使用精简 DOM 快照让 Agent 高效理解页面结构。
+Control a Chromium browser for web navigation, element interaction and content extraction. Supports JavaScript-rendered pages and uses a compact DOM snapshot so the Agent can efficiently understand page structure.
 
-## 安装
+## Installation
 
 <Tabs>
-<Tab title="CLI 安装(推荐)">
+<Tab title="CLI install (recommended)">
 ```bash
 cow install-browser
 ```
 
-该命令会自动完成:
-- 安装 `playwright` Python 包(旧系统自动降级兼容版本)
-- 在 Linux 上安装系统依赖
-- 下载 Chromium 浏览器(Linux 服务器自动使用无头精简版)
-- 自动检测国内网络并使用镜像加速
+This command will:
+- Install the `playwright` Python package (with auto-fallback for older systems)
+- Install system dependencies on Linux
+- Download the Chromium browser (Linux servers automatically use the headless build)
+- Detect China-mainland networks and use mirror acceleration
 </Tab>
-<Tab title="手动安装">
+<Tab title="Manual install">
 ```bash
 pip install playwright
 playwright install chromium
 ```
 
-Linux 服务器还需安装系统依赖:
+On Linux servers, install system dependencies as well:
 ```bash
 sudo playwright install-deps chromium
 ```
 
-如果系统较旧(如 Ubuntu 18.04,glibc < 2.28),需安装兼容版本:
+On older systems (e.g. Ubuntu 18.04, glibc < 2.28), install a compatible version:
 ```bash
 pip install playwright==1.28.0
 python -m playwright install chromium
 ```
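The glibc rule in the manual-install notes can be sketched in Python. This is a minimal illustration, not the logic `cow install-browser` actually runs: the helper names `parse_version` and `playwright_requirement` are hypothetical, and only the 2.28 threshold and the `playwright==1.28.0` pin come from the text.

```python
from __future__ import annotations

import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as '2.27' into (2, 27); non-numeric parts are skipped."""
    parts = []
    for piece in text.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if digits:
            parts.append(int(digits))
    return tuple(parts)


def playwright_requirement(libc_name: str, libc_version: str) -> str:
    """Return a pip requirement string for the detected C library (hypothetical helper)."""
    parsed = parse_version(libc_version)
    # Pin only when glibc is positively identified as older than 2.28.
    if libc_name == "glibc" and parsed and parsed < (2, 28):
        return "playwright==1.28.0"
    return "playwright"


if __name__ == "__main__":
    # platform.libc_ver() returns a (name, version) pair; both are empty off Linux.
    name, version = platform.libc_ver()
    print(playwright_requirement(name, version))
```

Comparing versions as integer tuples avoids the string-comparison trap where `"2.5"` would sort after `"2.28"`.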
 
-国内网络下载 Chromium 较慢,可设置镜像加速:
+To accelerate the Chromium download from China:
 ```bash
 export PLAYWRIGHT_DOWNLOAD_HOST=https://registry.npmmirror.com/-/binary/playwright
 python -m playwright install chromium
@@ -45,55 +45,55 @@ description: 控制浏览器访问和操作网页
 </Tabs>
 
 <Note>
-1. 支持 Ubuntu 20.04+、Debian 10+、macOS、Windows。Ubuntu 18.04 等旧系统会自动降级安装兼容版本。
-2. 浏览器工具依赖较重(约300MB),为可选安装。轻量的网页内容获取可使用 `web_fetch` 工具。
+1. Supported on Ubuntu 20.04+, Debian 10+, macOS and Windows. Older systems such as Ubuntu 18.04 will fall back to a compatible version automatically.
+2. The browser tool has heavy dependencies (~300MB) and is optional. For lightweight web content retrieval, use the `web_fetch` tool.
 </Note>
 
-## 工作流程
+## Workflow
 
-Agent 使用浏览器的典型流程:
+A typical browser workflow for the Agent:
 
-1. **`navigate`** — 打开目标 URL
-2. **`snapshot`** — 获取页面精简 DOM,交互元素自动编号(ref)
-3. **`click` / `fill` / `select`** — 通过 ref 编号操作元素
-4. **`snapshot`** — 再次快照验证操作结果
+1. **`navigate`** — Open the target URL
+2. **`snapshot`** — Get a compact DOM with auto-numbered interactive elements (`ref`)
+3. **`click` / `fill` / `select`** — Operate elements by `ref`
+4. **`snapshot`** — Snapshot again to verify the result
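To make the `snapshot` step more concrete, here is a rough sketch of how interactive elements could be numbered with a `ref`. It is an illustration only, not the tool's real snapshot code: the `RefSnapshot` class, the `INTERACTIVE_TAGS` set and the output line format are all assumptions, and it parses static HTML with the standard library rather than a live browser page.

```python
from __future__ import annotations

from html.parser import HTMLParser

# Tags treated as interactive in this sketch (an assumption, not the tool's list).
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}


class RefSnapshot(HTMLParser):
    """Collect interactive elements and give each one a sequential ref number."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        ref = len(self.lines) + 1
        attr_map = dict(attrs)
        # Use the first identifying attribute that is present as a short label.
        label = attr_map.get("name") or attr_map.get("href") or attr_map.get("id") or ""
        self.lines.append(f"[{ref}] <{tag}> {label}".rstrip())


def snapshot(html: str) -> list[str]:
    """Return one compact line per interactive element, in document order."""
    parser = RefSnapshot()
    parser.feed(html)
    parser.close()
    return parser.lines


if __name__ == "__main__":
    page = '<form><input name="q"><button id="go">Search</button></form><a href="/next">Next</a>'
    for line in snapshot(page):
        print(line)
```

A later `click` or `fill` can then refer to an element by its number instead of a CSS selector, which is what keeps the snapshot compact.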
 
-## 支持的操作
+## Supported Actions
 
-| 操作 | 说明 | 关键参数 |
+| Action | Description | Key parameters |
 | --- | --- | --- |
-| `navigate` | 打开 URL | `url` |
-| `snapshot` | 获取页面结构化文本(主要方式) | `selector`(可选) |
-| `click` | 点击元素 | `ref` 或 `selector` |
-| `fill` | 填入文本 | `ref` 或 `selector`,`text` |
-| `select` | 下拉选择 | `ref` 或 `selector`,`value` |
-| `scroll` | 滚动页面 | `direction`(up/down/left/right) |
-| `screenshot` | 截图保存到工作区 | `full_page` |
-| `wait` | 等待元素或超时 | `selector`,`timeout` |
-| `press` | 按键(Enter、Tab 等) | `key` |
-| `back` / `forward` | 浏览器前进/后退 | - |
-| `get_text` | 获取元素文本内容 | `selector` |
-| `evaluate` | 执行 JavaScript | `script` |
+| `navigate` | Open URL | `url` |
+| `snapshot` | Get structured page text (primary way) | `selector` (optional) |
+| `click` | Click an element | `ref` or `selector` |
+| `fill` | Fill text into an input | `ref` or `selector`, `text` |
+| `select` | Select a dropdown option | `ref` or `selector`, `value` |
+| `scroll` | Scroll the page | `direction` (up/down/left/right) |
+| `screenshot` | Save a screenshot to the workspace | `full_page` |
+| `wait` | Wait for an element or timeout | `selector`, `timeout` |
+| `press` | Press a key (Enter, Tab, etc.) | `key` |
+| `back` / `forward` | Browser back / forward | - |
+| `get_text` | Get an element's text content | `selector` |
+| `evaluate` | Run JavaScript | `script` |
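The table above can be read as a small schema. The sketch below checks a tool call against it before sending. The call shape (`{"action": ..., ...}`) and the `check_call` helper are hypothetical, and the table lists key parameters rather than required ones, so treating them as required (and `wait` as needing either `selector` or `timeout`) is an assumption.

```python
from __future__ import annotations

# Parameters per action, transcribed from the table. Each inner tuple lists
# alternatives: ("ref", "selector") means either one is enough.
REQUIRED: dict[str, list[tuple[str, ...]]] = {
    "navigate": [("url",)],
    "snapshot": [],
    "click": [("ref", "selector")],
    "fill": [("ref", "selector"), ("text",)],
    "select": [("ref", "selector"), ("value",)],
    "scroll": [("direction",)],
    "screenshot": [],
    "wait": [("selector", "timeout")],
    "press": [("key",)],
    "back": [],
    "forward": [],
    "get_text": [("selector",)],
    "evaluate": [("script",)],
}
SCROLL_DIRECTIONS = {"up", "down", "left", "right"}


def check_call(call: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks well-formed."""
    action = call.get("action")
    if action not in REQUIRED:
        return [f"unknown action: {action!r}"]
    problems = []
    for alternatives in REQUIRED[action]:
        if not any(call.get(name) not in (None, "") for name in alternatives):
            problems.append("missing " + " or ".join(alternatives))
    direction = call.get("direction")
    if action == "scroll" and direction is not None and direction not in SCROLL_DIRECTIONS:
        problems.append("direction must be one of up/down/left/right")
    return problems


if __name__ == "__main__":
    print(check_call({"action": "fill", "ref": 3}))
```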
 
-## 使用场景
+## Use Cases
 
-- 访问指定 URL 获取动态页面内容
-- 填写表单、登录操作
-- 操作网页元素(点击按钮、选择选项等)
-- 验证部署后的网页效果
-- 抓取需要 JS 渲染的动态内容
+- Access a URL to retrieve dynamic page content
+- Fill in forms and log in
+- Operate web elements (click buttons, select options, etc.)
+- Verify the result of a deployed web page
+- Scrape content that requires JS rendering
 
-## 运行模式
+## Run Mode
 
-浏览器会根据运行环境自动选择模式:
+The browser picks a mode based on the runtime environment:
 
-| 环境 | 模式 |
+| Environment | Mode |
 | --- | --- |
-| macOS / Windows | 有头模式(显示浏览器窗口) |
-| Linux 桌面(有 DISPLAY) | 有头模式 |
-| Linux 服务器(无 DISPLAY) | 无头模式(headless) |
+| macOS / Windows | Headed (browser window visible) |
+| Linux desktop (with DISPLAY) | Headed |
+| Linux server (no DISPLAY) | Headless |
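The selection rule in the table could be expressed roughly as follows. The function name `default_headless` is hypothetical and the project's real detection may differ. This sketch looks only at the platform name and the `DISPLAY` variable, as the table describes.

```python
import os
import sys


def default_headless(platform_name: str, env: dict) -> bool:
    """Headless only on Linux when DISPLAY is unset or empty, following the table."""
    if platform_name.startswith("linux"):
        return not env.get("DISPLAY")
    # macOS ("darwin") and Windows ("win32") run headed.
    return False


if __name__ == "__main__":
    print(default_headless(sys.platform, dict(os.environ)))
```

Passing the platform and environment in as arguments keeps the rule easy to test without changing the real process environment.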
 
-可在 `config.json` 中手动覆盖:
+You can override it in `config.json`:
 
 ```json
 {
@@ -105,15 +105,15 @@ Agent 使用浏览器的典型流程:
 }
 ```
 
-## 登录态持久化
+## Persistent Login
 
-**只需登录一次目标网站,Agent 后续可直接使用**。提供两种方式:
+**Log in to a target site once and the Agent can keep using it.** Two ways are supported:
 
-### 方式一:Persistent 模式(默认)
+### Option 1: Persistent mode (default)
 
-开箱即用,登录信息保存在 `~/.cow/browser_profile`。无需任何配置。
+Works out of the box. Login state is saved under `~/.cow/browser_profile`. No configuration needed.
 
-如需关闭持久化模式,每次都用纯净环境:
+To disable persistence and start with a clean environment every time:
 
 ```json
 {
@@ -125,11 +125,11 @@ Agent 使用浏览器的典型流程:
 }
 ```
 
-### 方式二:CDP 模式(接管真实 Chrome)
+### Option 2: CDP mode (attach to real Chrome)
 
-让 Agent 连接独立启动的真实 Chrome(而非 Playwright 自带的 Chromium),获得完整浏览器指纹,适合反爬严格的网站。
+Have the Agent connect to a separately launched real Chrome (instead of the Chromium bundled with Playwright) for full browser fingerprints. Useful for sites with strict bot detection.
 
-启动 Chrome 时加上调试端口和独立用户目录:
+Launch Chrome with a debugging port and a dedicated user data directory:
 
 <Tabs>
 <Tab title="macOS">
@@ -155,7 +155,7 @@ Agent 使用浏览器的典型流程:
 </Tab>
 </Tabs>
 
-在 `config.json` 中配置端点:
+Then point the Agent at the endpoint in `config.json`:
 
 ```json
 {
@@ -168,5 +168,5 @@ Agent 使用浏览器的典型流程:
 ```
 
 <Note>
-Chrome 137+ 限制 `--remote-debugging-port` 必须搭配独立 `--user-data-dir`,因此 CDP 启动的 Chrome **无法直接复用你日常 Chrome 的登录态**,需要在独立目录中重新登录一次。
+Chrome 137+ requires `--remote-debugging-port` to be paired with a dedicated `--user-data-dir`. As a result, the CDP-launched Chrome **cannot directly reuse the login state of your daily Chrome**; you'll need to log in once inside this dedicated profile.
 </Note>
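Before pointing the Agent at a CDP endpoint, it can help to confirm that Chrome is actually listening. The sketch below queries `/json/version`, the HTTP endpoint Chrome serves on its remote debugging port. The `probe_cdp` helper is hypothetical, and the address in the usage line is only a placeholder, since the port configured in this document's examples is not shown in the diff.

```python
import json
import urllib.error
import urllib.request


def probe_cdp(endpoint: str, timeout: float = 2.0):
    """Return the /json/version payload of a Chrome debugging endpoint, or None if unreachable."""
    url = endpoint.rstrip("/") + "/json/version"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return None


if __name__ == "__main__":
    # Placeholder address: substitute the host and port Chrome was launched with.
    info = probe_cdp("http://127.0.0.1:9222")
    print("reachable" if info else "not reachable")
```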