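# Browser and Computer Control

When the agent needs to *use* your machine the way a person would (opening pages, taking screenshots, clicking buttons, typing phrases), these are the tools that let it do so.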

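## Browser

* **Open** a URL in an embedded webview so the agent can read it back.
* **Screenshot** the current page.
* **Inspect** image output and metadata so the agent can describe what it sees.

The browser surface runs on CEF (Chromium Embedded Framework) and includes a safety layer that restricts what a page can do. See [Chromium Embedded Framework](/openhuman/zh/kai-fa/cef.md) for platform details.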

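## Computer (mouse + keyboard)

* **Mouse** - move, click, drag.
* **Keyboard** - type text, send key combinations.
* **Human paths** - moves and clicks follow human-like trajectories instead of teleporting, so they do not trip simple bot detection.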

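## When to use it

* Driving websites that have no API or [native integration](/openhuman/zh/gong-neng/integrations.md).
* Multi-step UI flows where a single screenshot is not enough.
* Automating local apps from chat.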

## See also

* [Web scraper](/openhuman/zh/gong-neng/native-tools/web-scraper.md) - when you only need the article, not the whole page.
* [Chromium Embedded Framework](/openhuman/zh/kai-fa/cef.md) - the runtime browser layer.

