GitHub Repository

Deterministic browser automation. Works out of the box with Claude/Codex/OpenCode

474 stars · C++

Open-source browser for AI agents (~90% on Mind2Web)

Author: theredsix

by theredsix · Mar 11, 2026 · 155 points · 55 comments


AI Analysis

●●● Banger · Wizardry · Big Brain · Zero to One

Forked Chromium to freeze execution state after each action, which addresses the stale-state problem that breaks most browser agents.

Strengths

•Freezes JavaScript execution after each action, eliminating race conditions in agent reasoning.
•Captures structured event summaries alongside screenshots for multimodal agent loops.
•~90% on the Online Mind2Web benchmark (90.5% with Opus 4.6 as the driver) shows real performance on an established evaluation.

Weaknesses

•Forking Chromium means heavy maintenance burden as upstream evolves.
•Browser automation for agents is still emerging, and Browser Use and Playwright already compete in it.

Post Description

Hi HN, I forked Chromium and built agent-browser-protocol (ABP) after noticing that most browser-agent failures aren’t really about the model misunderstanding the page. Instead, the problem is that the model is reasoning from a stale state.

ABP is designed to keep the acting agent synchronized with the browser at every step. After each action (click, type, etc.), it freezes JavaScript execution and rendering, then captures the resulting state. It also compiles the notable events that occurred during that action loop, such as navigation, file pickers, permission prompts, alerts, and downloads, and sends that along with a screenshot of the frozen page state back to the agent.
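A minimal sketch of what such a per-action response could look like, assuming a simple record of a screenshot plus a list of events. The names `BrowserEvent`, `StepResult`, and `summarize` are hypothetical and chosen for illustration; they are not ABP's actual types or wire format.

```python
from dataclasses import dataclass, field


@dataclass
class BrowserEvent:
    """One notable thing that happened while an action ran (hypothetical shape)."""
    kind: str    # e.g. "navigation", "file_picker", "permission_prompt", "alert", "download"
    detail: str  # free-form description, such as a URL or a dialog message


@dataclass
class StepResult:
    """What the agent receives after one action: a frozen-page screenshot plus events."""
    screenshot_png: bytes
    events: list[BrowserEvent] = field(default_factory=list)


def summarize(result: StepResult) -> str:
    """Render the events as text to send alongside the screenshot in the next model turn."""
    if not result.events:
        return "No notable events."
    return "\n".join(f"- {e.kind}: {e.detail}" for e in result.events)


result = StepResult(
    screenshot_png=b"",  # placeholder; a real response would carry image bytes
    events=[
        BrowserEvent("navigation", "https://example.com/checkout"),
        BrowserEvent("alert", "Your session will expire soon"),
    ],
)
print(summarize(result))
```

Pairing the image with a short text summary means the model does not have to infer from pixels alone that, say, a navigation or an alert occurred.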

The result is that browser interaction starts to feel more like a multimodal chat loop. The agent takes an action, gets back a fresh visual state and a structured summary of what happened, then decides what to do next from there. That fits much better with how LLMs already work.
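The act, observe, decide cycle described here can be sketched as a plain loop. Everything below is a toy under stated assumptions: `FakeBrowser` stands in for a browser that pauses after each action, and `scripted_policy` stands in for the model. Neither reflects ABP's real interface.

```python
from typing import Callable, Optional


class FakeBrowser:
    """Stand-in for a browser that pauses after every action (not ABP's real interface)."""

    def __init__(self) -> None:
        self.typed = ""
        self.submitted = False

    def act(self, action: dict) -> dict:
        """Apply one action, then return the state as of the moment the page was frozen."""
        events = []
        if action["type"] == "type":
            self.typed += action["text"]
        elif action["type"] == "click" and action["target"] == "submit":
            self.submitted = True
            events.append({"kind": "navigation", "detail": "/results?q=" + self.typed})
        frame = f"<frame typed={self.typed!r} submitted={self.submitted}>"
        return {"screenshot": frame, "events": events}


def run_agent(
    browser: FakeBrowser,
    policy: Callable[[dict], Optional[dict]],
    max_steps: int = 10,
) -> list:
    """Act, observe the frozen state, decide again; stop when the policy returns None."""
    observation = {"screenshot": "<initial frame>", "events": []}
    history = []
    for _ in range(max_steps):
        action = policy(observation)
        if action is None:
            break
        observation = browser.act(action)
        history.append(observation)
    return history


def scripted_policy(observation: dict) -> Optional[dict]:
    """Toy stand-in for the model: picks the next action from the latest observation only."""
    if any(e["kind"] == "navigation" for e in observation["events"]):
        return None  # the search was submitted, so the task is done
    if "typed='abp'" in observation["screenshot"]:
        return {"type": "click", "target": "submit"}
    return {"type": "type", "text": "abp"}


history = run_agent(FakeBrowser(), scripted_policy)
print(len(history), history[-1]["events"])
```

Because the policy only ever sees the observation returned by the action it just took, each decision is made against the state that action produced rather than an earlier snapshot.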

A few common browser-use failures ABP helps eliminate:

* A modal appears after the last Playwright screenshot and blocks the input the agent was about to use
* Dynamic filters cause the page to reflow between steps
* An autocomplete dropdown opens and covers the element the agent intended to click
* alert() / confirm() interrupts the flow
* Downloads are triggered, but the agent has no reliable way to know when they’ve completed
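For the dialog and download cases, a structured event list lets the driver check for interruptions before continuing with its plan. The sketch below is one way that check might look; the event kinds (`download_started`, `download_completed`, and so on) and the dictionary layout are assumptions made for illustration, not ABP's actual event vocabulary. The modal, reflow, and dropdown cases are instead covered by the fresh screenshot.

```python
# Event kinds that mean the planned next action may no longer be valid.
# These names are illustrative, not ABP's actual event vocabulary.
BLOCKING_KINDS = {"alert", "confirm", "file_picker", "permission_prompt"}


def pending_downloads(events: list[dict]) -> set[str]:
    """Return filenames whose download started but has not been reported complete."""
    started, finished = set(), set()
    for event in events:
        if event["kind"] == "download_started":
            started.add(event["file"])
        elif event["kind"] == "download_completed":
            finished.add(event["file"])
    return started - finished


def next_step(planned: str, events: list[dict]) -> str:
    """Decide whether to carry on with the plan or deal with what just happened first."""
    blockers = [e["kind"] for e in events if e["kind"] in BLOCKING_KINDS]
    if blockers:
        return f"handle {blockers[0]} before '{planned}'"
    waiting = pending_downloads(events)
    if waiting:
        return f"wait for {', '.join(sorted(waiting))} before '{planned}'"
    return planned


print(next_step("click Export", []))
print(next_step("type email", [{"kind": "alert", "message": "Session expired"}]))
print(next_step("open report", [{"kind": "download_started", "file": "report.csv"}]))
```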

As proof, ABP with Opus 4.6 as the driver scores 90.5% on the Online Mind2Web benchmark. I think modern LLMs already understand websites; they just need a better tool to interact with them. Happy to answer questions about the architecture, forking Chrome, or anything else in the comments below.

Try it out: `claude mcp add browser -- npx -y agent-browser-protocol --mcp` (Codex/OpenCode instructions in the docs)

Demo video: https://www.loom.com/share/387f6349196f417d8b4b16a5452c3369

Similar Projects

Developer Tools · ●● Solid

Clark-Browser – Stealth Chromium

Chromium fork patched at C++ source level to bypass bot detection without JS shims.

Niche Gem · Wizardry · Solve My Problem

stan_kirdey

16427d ago

AI/ML · ●● Solid

Agent Action Guard – AI agent action safety

HarmActionsEval benchmark proves GPT and Claude fail at blocking harmful tool use.

Solve My Problem · Niche Gem

praneeth-v

202mo ago

AI/ML · ●●● Banger

DashClaw – Intercept AI agent actions before they execute

Control before execution beats observability after—HITL with 10-min replay window.

Solve My Problem · Big Brain · Slick

ucsandman

113mo ago

Infrastructure · ●●● Banger

Cycles – hard limits on agent actions before execution

Reserve-commit lifecycle blocks agent actions before execution, unlike standard rate limiters.

Solve My Problem · Big Brain · Ship It

amavashev

122mo ago

AI/ML · ● Mid

Deterministic browser control for AI agents (~90% on Mind2Web)

Deterministic browser steps for agent reasoning, but README is just Chromium boilerplate with no substantive implementation details.

Big Brain

theredsix

1273mo ago

Developer Tools · ●● Solid

ContextSubstrate – Capture, diff, replay AI agent runs (Git agent work)

Git for AI agent runs—pack, diff, replay, and verify agent work with content addressing.

Big Brain · Wizardry

scalefirst

113mo ago