Skip to content

hkd987/remix-browser

Repository files navigation

remix-browser

Blazing fast headless Chrome automation via CDP — no extension needed.

Installation · Tools · Configuration · Architecture

CI


A Rust-native MCP server that gives AI agents full control over a real Chrome browser through the Chrome DevTools Protocol. No browser extensions, no Puppeteer, no Node.js — just a single binary that speaks CDP.

Benchmarks

Method Time Cost Turns Success
remix-browser 3m 35s $0.83 23 100%
Dev Browser 3m 53s $0.88 29 100%
Playwright MCP 4m 31s $1.45 51 100%
Playwright Skill 8m 07s $1.45 38 100%
Claude Code Native Chrome 12m 54s $2.81 80 100%

_See dev-browser-eval for methodology. The model used was sonnet 4.5

Why remix-browser?

remix-browser Extension-based MCPs Puppeteer wrappers
Startup Single binary, instant Requires browser extension install Node.js + npm install
Reliability Hybrid click strategy with automatic fallback Extension message passing Basic click only
Multi-tab Built-in tab pool management Limited by extension API Manual page tracking
Network capture First-class network monitoring Not available Requires extra setup
Console logs Built-in capture with filtering Not available Requires extra setup
Selectors CSS + Text + XPath CSS only CSS + XPath
Language Rust (fast, safe, single binary) JavaScript JavaScript

Installation

Claude Code Plugin (recommended)

/plugin marketplace add hkd987/remix-browser
/plugin install remix-browser@hkd987-remix-browser

That's it — the binary downloads automatically on first use. No Rust required.

Download pre-built binary

curl -fsSL https://raw.githubusercontent.com/hkd987/remix-browser/main/scripts/install.sh | sh

From source

git clone https://github.com/hkd987/remix-browser.git
cd remix-browser
cargo build --release

Requirements

  • Google Chrome or Chromium installed (auto-detected)
  • Rust 1.88+ only needed if building from source

Quick Start

Add to Claude Code

Add to your Claude Code MCP config (~/.claude/mcp.json):

{
  "mcpServers": {
    "remix-browser": {
      "command": "/path/to/remix-browser"
    }
  }
}

That's it. Claude now has a browser.

Headed mode (see what's happening)

{
  "mcpServers": {
    "remix-browser": {
      "command": "/path/to/remix-browser",
      "args": ["--headed"]
    }
  }
}

Connect to your existing browser

Chrome 144+ has a built-in toggle that lets any agent connect to your running browser — no extensions needed.

  1. Open chrome://inspect/#remote-debugging in Chrome
  2. Enable the remote debugging toggle
  3. Connect remix-browser:
{
  "mcpServers": {
    "remix-browser": {
      "command": "/path/to/remix-browser",
      "args": ["--cdp-url", "ws://127.0.0.1:9222"]
    }
  }
}

Or use an environment variable:

{
  "mcpServers": {
    "remix-browser": {
      "command": "/path/to/remix-browser",
      "env": {
        "CDP_URL": "ws://127.0.0.1:9222"
      }
    }
  }
}

HTTP URLs also work — the WebSocket URL is auto-discovered from /json/version:

CDP_URL=http://127.0.0.1:9222 remix-browser

When connected to an external browser, remix-browser uses your existing tabs and gracefully disconnects on exit without closing Chrome.

Best performance tip

For the best experience, add this line to your project's CLAUDE.md (or ~/.claude/CLAUDE.md for all projects):

When I ask to use Chrome or browser automation, use remix-browser MCP tools.
For 1-2 simple actions, granular tools are fine.
For workflows with 3+ actions, loops, or extraction, prefer `run_script`.
Use `fill` for setting any form control — it auto-detects input type (text, select, checkbox, range slider).
Snapshots auto-append after every tool call, so [ref=eN] selectors are always fresh.

This tells Claude to automatically reach for remix-browser whenever you mention browser tasks — no need to say "remix-browser" by name.

Performance Usage Pattern

  • Use granular tools for short flows (navigate -> click -> get_text).
  • Use run_script for multi-step workflows, loops, and repeated extraction.
  • Use fill instead of type_text + select_option — it auto-detects the control type (text, select, checkbox, range slider, ARIA slider).
  • Snapshots auto-append after every tool call, so [ref=eN] selectors are always available without a separate snapshot call.
  • All interaction tools (click, type_text, fill) include auto-wait — they poll up to 5 seconds for the element to appear before acting, eliminating timing errors on dynamic pages.

Tools

remix-browser exposes a broad toolset organized by category.

Navigation

Tool Description
navigate Go to a URL. Supports load, domcontentloaded, and networkidle wait strategies.
go_back Navigate back in history.
go_forward Navigate forward in history.
reload Reload the current page.
get_page_info Get current URL, title, and viewport dimensions.

Finding Elements

Tool Description
find_elements Find elements by CSS selector, text content, or XPath. Returns tag, text, attributes, and node IDs.
get_text Extract text content from a matched element.
get_html Get inner or outer HTML of the page or a specific element.
wait_for Wait for an element to appear, disappear, or become visible. Configurable timeout.

Snapshot

Tool Description
snapshot Return a compact list of interactive elements with stable refs like [ref=e0] that can be reused in selectors.

Interaction

Tool Description
click Click elements using a hybrid strategy — tries real mouse events first, falls back to JS dispatch if the element is obscured. Auto-waits up to 5s for the element to appear.
type_text Type into input fields. Optionally clear existing content first. Auto-waits up to 5s for the element to appear.
fill Smart form control setter — auto-detects input type and sets the value appropriately. Works with text inputs, textareas, <select>, checkboxes, input[type=range] sliders, and ARIA role="slider" elements.
hover Hover over elements (fires mouseenter, mouseover, mousemove).
select_option Select an option in a <select> dropdown by value.
press_key Press keyboard keys (Enter, Tab, ArrowDown, etc.) with optional modifiers.
scroll Scroll the page or a specific element in any direction.

Screenshots

Tool Description
screenshot Capture the viewport, full page, or a specific element as base64 PNG/JPEG.

JavaScript & Console

Tool Description
execute_js Run arbitrary JavaScript and get the result back as JSON.
read_console Read captured console.log/warn/error output. Filter by level or regex pattern.

Network Monitoring

Tool Description
network_enable Start capturing network requests. Optionally filter by URL patterns.
get_network_log Query captured requests by URL pattern, HTTP method, or status code. Includes timing data.

Tab Management

Tool Description
new_tab Open a new tab, optionally navigating to a URL.
close_tab Close a specific tab or the active one.
list_tabs List all open tabs with their URLs and titles.

Script Automation

Tool Description
run_script Execute multi-step browser automation in one tool call with a synchronous page.* API. Includes page.fill(), page.click(), page.type(), page.js(), and more. [ref=eN] selectors auto-resolve inside page.js() expressions. Best for loops, repeated actions, and extraction workflows.

Selector Types

All element-targeting tools support three selector strategies:

CSS (default):  "button.submit", "#login-form", "div > p:first-child"
Text:           "Sign In", "Submit Order", "Click here"
XPath:          "//button[@type='submit']", "//div[contains(@class, 'menu')]"

Text selectors use a TreeWalker to find elements by their visible text content — no need to inspect the DOM to find the right CSS class.

The Hybrid Click

Most browser automation tools fail on modern JS-heavy sites. Dropdown menus, overlays, dynamically positioned elements — they all break simple element.click().

remix-browser uses a hybrid click strategy:

  1. Auto-wait up to 5 seconds for the element to appear in the DOM
  2. Scroll the element into view
  3. Check visibility and whether it's obscured by other elements
  4. Dispatch real mouse events (mousedown -> mouseup -> click) at the element's coordinates
  5. If the element is obscured (e.g., behind an overlay), automatically fall back to JavaScript click()
  6. Report which method was used so you know exactly what happened

This means clicks just work — even on sites with complex overlays, sticky headers, dynamic menus, and late-loading elements.

Configuration

Option Default Description
--headed false Show the browser window instead of running headless
--cdp-url Connect to an existing browser via CDP WebSocket or HTTP URL
CDP_URL env var Same as --cdp-url (env var alternative)
RUST_LOG env var info Control log verbosity (debug, trace, etc.)

Chrome Detection

remix-browser automatically finds Chrome on your system:

  • macOS: /Applications/Google Chrome.app, Homebrew paths, Chrome Canary
  • Linux: /usr/bin/google-chrome, snap packages, Chromium
  • Windows: Program Files, Local AppData

Falls back to which google-chrome / which chromium if standard paths don't exist.

Default Browser Settings

  • Viewport: 1280x720
  • Headless mode: --headless=new (Chrome's latest headless implementation)
  • Extensions, sync, popups, and first-run prompts are all disabled for a clean automation environment

Architecture

src/
├── main.rs                # CLI entry point
├── server.rs              # MCP ServerHandler — routes tool calls
├── browser/
│   ├── session.rs         # Browser lifecycle management
│   ├── pool.rs            # Multi-tab tracking (TabPool)
│   └── launcher.rs        # Chrome binary detection & launch config
├── tools/
│   ├── navigation.rs      # navigate, go_back, go_forward, reload
│   ├── dom.rs             # find_elements, get_text, get_html, wait_for
│   ├── interaction.rs     # click, type_text, fill, hover, select_option, press_key, scroll
│   ├── screenshot.rs      # screenshot capture
│   ├── snapshot.rs        # compact interactive tree + ref generation
│   ├── javascript.rs      # execute_js, console log capture
│   ├── network.rs         # network monitoring
│   ├── page.rs            # tab management
│   └── script.rs          # run_script JS engine and page API
├── interaction/
│   ├── click.rs           # Hybrid click strategy implementation
│   ├── keyboard.rs        # Key press & text input
│   ├── scroll.rs          # Scroll logic
│   └── wait.rs            # Auto-wait polling for element existence
└── selectors/
    ├── mod.rs             # Selector normalization & :has-text() conversion
    ├── css.rs             # CSS selector resolution
    ├── text.rs            # Text content matching via TreeWalker
    ├── xpath.rs           # XPath evaluation
    └── ref.rs             # [ref=eN] snapshot reference resolution

Built on:

Testing

# Run all 50 integration tests (uses real headless Chrome)
cargo test -- --test-threads=4

# Run a specific test
cargo test test_navigate

Tests use fixture HTML files and spin up isolated Chrome instances with unique profiles — no shared state, no flakiness.

License

MIT

About

A Rust MCP server for headless Chrome automation via CDP — built as a Claude Code plugin.

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors