Most of the following content is AI-generated. I have not proofread it yet, so the quality may be low.

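autoplay

autoplay is an experimental browser agent: you give it a task description, and it generates Playwright Python code and executes it, repeatedly observing the page and adjusting its actions until the goal is reached.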

https://github.com/CNSeniorious000/autoplay

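The core idea is simple: reasonify handles task decomposition and reasoning, the prompt constrains the model's output to pure executable code (no extra commentary), and at runtime the generated code is executed in a controlled environment and its results are collected.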
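To make "execute the generated code and collect the results" concrete, here is a minimal sketch using only the standard library. The helper `run_generated_code` is hypothetical and is not autoplay's actual runner. Plain `exec` is also not a sandbox, so this only illustrates the data flow, not the isolation.

```python
import contextlib
import io
import traceback


def run_generated_code(source: str, namespace: dict) -> dict:
    """Execute a code string in `namespace` and collect stdout plus any error.

    Hypothetical helper for illustration only; not autoplay's real runner.
    """
    stdout = io.StringIO()
    error = None
    try:
        with contextlib.redirect_stdout(stdout):
            exec(source, namespace)  # runs with full privileges: NOT a sandbox
    except Exception:
        error = traceback.format_exc()
    return {"stdout": stdout.getvalue(), "error": error}


namespace: dict = {}
ok = run_generated_code("x = 21 * 2\nprint(x)", namespace)
bad = run_generated_code("print(undefined_name)", namespace)
```

Because the namespace persists between calls, later snippets can reuse variables created by earlier ones, and a failed snippet yields a traceback that could be sent back to the model.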

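Why output native code

Emitting Playwright code directly avoids inventing yet another DSL, and it makes debugging more straightforward: you can paste the generated code straight into a REPL and run it.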

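Dual-track execution architecture

autoplay has two independent execution paths, each with its own strengths.

`step.py`: interactive code-injection loop

This is the classic generate, execute, observe loop: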

# src/core/step.py:44-58
class Session:
    def __init__(self):
        self.messages: list[ExtendedMessage] = [{"role": "system", "content": prompt.render()}]
        context = ExecContext()
        self.run_code = context.run
        self.namespace = context.namespace #(1)!

        @context.register
        @wraps(input)
        def _(prompt: str):
            return Prompt.ask(f"[cyan]{prompt}") #(2)!

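(1) ExecContext: the sandbox provided by useful-coderunner; it executes arbitrary Python code and returns the result.
(2) input hijacking: replaces the built-in input() with rich.prompt.Prompt for a better CLI experience.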
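The input hijacking above can be sketched without any third-party packages. `MiniContext` is a hypothetical stand-in for ExecContext, and it assumes that `register` exposes a function under its `__name__`, which would explain why the original wraps the function with `@wraps(input)`. The real useful-coderunner API may differ.

```python
from functools import wraps


class MiniContext:
    """Hypothetical stand-in for ExecContext: a namespace plus a register decorator."""

    def __init__(self):
        self.namespace: dict = {}

    def register(self, func):
        # Assumption: the function is exposed to executed code under its __name__.
        self.namespace[func.__name__] = func
        return func

    def run(self, source: str):
        exec(source, self.namespace)


context = MiniContext()
asked: list[str] = []


@context.register
@wraps(input)  # copies __name__ == "input", so the function is registered as `input`
def _(prompt: str = ""):
    asked.append(prompt)
    return "canned answer"  # a real CLI would ask the user here


# Code run in the context now resolves `input` to the replacement, not the builtin.
context.run("answer = input('Which site?')")
```

Names in the namespace are looked up before builtins, so generated code that calls `input()` reaches the replacement transparently.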

The key mechanism is page state capture:

# src/core/step.py:85-107
async def step(self, message=""):
    messages = self.messages.copy()
    for key, value in self.namespace.items():
        if isinstance(value, Page) and not value.is_closed():
            url, title, content, image_url, html = await get_page_information(value) #(1)!
            messages.append({
                "role": "user",
                "content": [
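                    # Prompt text below, in English: "`{key}` is a playwright Page; this is its full-page screenshot:"
                    {"type": "text", "text": f"`{key}` 是一个 playwright 的 Page，这是它的整页截图："},
                    {"type": "image_url", "image_url": {"url": image_url, "detail": "low"}}, #(2)!
                    # Prompt text below, in English: "Title: {title}, URL: {url}, Content: {content}, HTML: {html}"
                    {"type": "text", "text": f"标题: {title}, URL: {url}, 内容: {content}, HTML: {html}"},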
                ],
            })

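(1) Each live Page is condensed into a screenshot (low-resolution, base64-encoded), its text content (extracted with readability), and the cleaned DOM.
(2) detail: "low" reduces the token cost of the image: the LLM does not need to make out fine page details, only to perceive the layout.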
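As a rough illustration of how one page's state might be packed into a single multimodal message, the sketch below builds the same three-part structure from plain values. The helpers `to_data_url` and `build_page_message` are hypothetical, and encoding the screenshot as a PNG data URL is an assumption; the real `get_page_information` may produce the image URL differently.

```python
import base64


def to_data_url(png_bytes: bytes) -> str:
    """Encode raw PNG bytes as a base64 data URL (assumed image format)."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


def build_page_message(name: str, url: str, title: str, content: str, html: str, png_bytes: bytes) -> dict:
    """Pack one page's state into a single multimodal user message."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": f"`{name}` is a Playwright Page; this is its screenshot:"},
            # "detail": "low" asks for the cheaper, coarse image representation
            {"type": "image_url", "image_url": {"url": to_data_url(png_bytes), "detail": "low"}},
            {"type": "text", "text": f"Title: {title}, URL: {url}, Content: {content}, HTML: {html}"},
        ],
    }


fake_png = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a real screenshot
message = build_page_message("page", "https://example.com", "Example", "Hello", "<h1>Hello</h1>", fake_png)
```

Keeping the screenshot and the text in one message ties the image to the variable name the model has to use in its next snippet.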

`new.py`: reasonify toolchain mode

This path is safer, but it restricts the LLM to calling predefined tools:

# src/core/new.py:23-47
@tool
async def get_page_from_url(
    context: BrowserContext,
    url: str,
    wait_until: Literal["commit", "domcontentloaded", "load", "networkidle"] = "domcontentloaded",
) -> Page:
    """Get a playwright async api page object. Useful when you need to search something."""
    page = await context.new_page()
    await page.goto(url, wait_until=wait_until)
    return page

@tool
async def new_playwright_context() -> BrowserContext: #(1)!
    """Get a playwright async api context object. Use this when the user ask for any web scraping job."""
    from ..utils.browser import new_context
    return await new_context()

(1) Tool registration: reasonify's @tool decorator lets the LLM call these functions instead of executing arbitrary code directly.
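The general idea of a tool registry can be sketched as follows. This is not reasonify's implementation: `TOOLS`, `tool`, `call_tool`, and `describe_tools` are hypothetical names that only show how a decorator can restrict a model to an allow-list of functions.

```python
import asyncio
import inspect

TOOLS: dict = {}  # hypothetical registry: tool name -> function


def tool(func):
    """Register a function so that it can be called by name."""
    TOOLS[func.__name__] = func
    return func


@tool
async def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


async def call_tool(name: str, **kwargs):
    """Dispatch a model-requested call, refusing anything not registered."""
    if name not in TOOLS:
        raise PermissionError(f"{name!r} is not a registered tool")
    result = TOOLS[name](**kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result


def describe_tools() -> list[str]:
    """Render signatures and docstrings, e.g. for inclusion in a prompt."""
    return [f"{name}{inspect.signature(f)}: {inspect.getdoc(f)}" for name, f in TOOLS.items()]


result = asyncio.run(call_tool("add", a=2, b=3))
```

The docstrings double as the descriptions the model sees, which is why the tools in `new.py` spell out when each one should be used.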

`step.py`	`new.py`
Arbitrary Python	Functions marked with `@tool`
Automatic screenshot + DOM	Requires explicit tool calls
Fully open	Restricted tool set
`useful-coderunner`	`reasonify-headless`

DOM cleaning

get_cleaned_dom in src/core/parse.py is responsible for cleaning the HTML:

# src/core/parse.py:10-33
def get_cleaned_dom(html: str):
    html = replace_uuids(html)  # random UUID -> [UUID]
    html = replace_blobs(html)  # blob:[UUID] -> [BLOB]

    dom = Selector(html)

    dom.css("script, noscript, iframe, style, img[src^='data:'], svg, source, link").drop() #(1)!

    for i in dom.css("[style]"):
        remove_attrib(i, "style") #(2)!

    for i in dom.css("*"):
        for attr in i.attrib:
            if attr not in ("class", "id", "role", "src", "alt", "title") and not attr.startswith("aria-"): #(3)!
                remove_attrib(i, attr)

    return dom

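(1) Drop useless tags: script, noscript, and the like do not affect semantic understanding.
(2) Drop inline styles: style attributes are usually long and contribute little to understanding the page structure.
(3) Keep only semantic attributes: role and aria-* are easier for an LLM to interpret, and they also matter for accessibility.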
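The attribute allow-list in annotation (3) can be tried in isolation. The sketch below applies the same condition to a plain dict instead of a parsel element; `keep_attribute` and `filter_attributes` are hypothetical helpers, not functions from the repository.

```python
KEPT_ATTRIBUTES = ("class", "id", "role", "src", "alt", "title")


def keep_attribute(name: str) -> bool:
    """Same condition as in get_cleaned_dom: allow-listed names plus aria-*."""
    return name in KEPT_ATTRIBUTES or name.startswith("aria-")


def filter_attributes(attrs: dict[str, str]) -> dict[str, str]:
    """Return only the attributes that survive cleaning."""
    return {name: value for name, value in attrs.items() if keep_attribute(name)}


button = {
    "class": "btn",
    "role": "button",
    "aria-label": "Close",
    "data-testid": "close-42",
    "onclick": "closeModal()",
    "style": "color: red",
}
cleaned = filter_attributes(button)
```

Note that `style` is not on the allow-list, so this filter alone already removes it.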

The overall approach is to strip noise (data-*, onclick, and so on), keep the semantics, and normalize UUIDs and blob URLs to avoid spurious diffs.
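The normalization step could look roughly like this. The function names match those called in `get_cleaned_dom`, but the bodies and regular expressions are assumptions made for illustration; the repository's actual patterns are not shown in this page.

```python
import re

# Assumed patterns; the real ones in the repository may differ.
UUID_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
    re.IGNORECASE,
)
BLOB_RE = re.compile(r"blob:[^\s\"'<>)]+")


def replace_uuids(html: str) -> str:
    """Replace every UUID-shaped token with the placeholder [UUID]."""
    return UUID_RE.sub("[UUID]", html)


def replace_blobs(html: str) -> str:
    """Replace every blob: URL with the placeholder [BLOB]."""
    return BLOB_RE.sub("[BLOB]", html)


def normalize(html: str) -> str:
    # Same order as get_cleaned_dom: UUIDs first, then blob URLs.
    return replace_blobs(replace_uuids(html))


first = '<video src="blob:https://example.com/3f2b8c1e-9d4a-4e7b-8a6f-1c2d3e4f5a6b" id="player-123e4567-e89b-12d3-a456-426614174000"></video>'
second = '<video src="blob:https://example.com/0a1b2c3d-4e5f-4a6b-8c7d-9e0f1a2b3c4d" id="player-98765432-abcd-4ef0-9123-456789abcdef"></video>'
normalized = normalize(first)
```

Two renders of the same page that differ only in their random identifiers normalize to the same string, so a diff between successive observations shows only real changes.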

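Deeper insights

The source code and templates make it clear that autoplay treats LLM output as executable Playwright Python snippets: src/templates/prompt.j2 forces the model to return pure Python only (Playwright Async API), and src/core/step.py and src/core/new.py inject the LLM output into ExecContext and run it. The implementation uses reasonify to organize reasoning into a chain/loop, so the system can execute generated code immediately and feed the results back to the model, forming a closed "generate, execute, correct" loop. This also brings obvious security and sandboxing risks.

Source references (verified via GraphQL): src/templates/prompt.j2 (generates Playwright code) · src/core/step.py (ExecContext / Session execution)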
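The closed loop can be illustrated with a stub in place of the LLM. Everything here is hypothetical (`fake_model`, `run_loop`); it only shows how an error from one round becomes input for the next, and why running model output with `exec` carries the risks mentioned above.

```python
import traceback


def fake_model(history: list[str]) -> str:
    """Stub standing in for the LLM: emits buggy code first, then a fix once it sees the error."""
    if any("ZeroDivisionError" in entry for entry in history):
        return "result = 10 / 2"
    return "result = 10 / 0"


def run_loop(model, max_steps: int = 3):
    """Generate, execute, and feed errors back until a snippet runs cleanly."""
    history: list[str] = []
    namespace: dict = {}
    for step in range(1, max_steps + 1):
        code = model(history)
        try:
            exec(code, namespace)  # full privileges: exactly the security concern
        except Exception:
            history.append(traceback.format_exc())  # feedback for the next round
            continue
        return namespace, step
    raise RuntimeError("model did not produce working code")


namespace, steps = run_loop(fake_model)
```

The step limit matters in practice: without it, a model that keeps producing failing code would loop forever.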
