118 changes: 91 additions & 27 deletions skills/browser/SKILL.md
@@ -3,7 +3,7 @@ name: browser
description: Automate web browser interactions using natural language via CLI commands. Use when the user asks to browse websites, navigate web pages, extract data from websites, take screenshots, fill forms, click buttons, or interact with web applications. Supports remote Browserbase sessions with automatic CAPTCHA solving, anti-bot stealth mode, and residential proxies — ideal for scraping protected websites, bypassing bot detection, and interacting with JavaScript-heavy pages.
compatibility: "Requires the browse CLI (`npm install -g @browserbasehq/browse-cli`). Remote Browserbase sessions need `BROWSERBASE_API_KEY`. Local mode uses Chrome/Chromium on your machine."
license: MIT
allowed-tools: Bash
allowed-tools: Bash Read Write Edit Glob
metadata:
openclaw:
requires:
@@ -20,43 +20,62 @@ metadata:

Automate browser interactions using the browse CLI with Claude.

## Setup check
## Step 1 — BEFORE doing anything else: Setup + Memory Check

Before running any browser commands, verify the CLI is available:
Run this as your VERY FIRST action for any browsing task. Do NOT run `browse open` before this:

```bash
which browse || npm install -g @browserbasehq/browse-cli
mkdir -p ${CLAUDE_SKILL_DIR}/memory
echo "=== SITE MEMORY ==="
cat ${CLAUDE_SKILL_DIR}/memory/MEMORY_FILE.md 2>/dev/null || echo "NO MEMORY — will need to snapshot after opening"
```

## Environment Selection (Local vs Remote)
Replace MEMORY_FILE with the domain, using dashes for dots/colons/slashes (e.g., `news-ycombinator-com`, `github-com`, `localhost-3000`).
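
The naming rule above can be sketched in shell. This is only an illustration under stated assumptions: `memory_key` is a hypothetical helper name, not part of the browse CLI, and it keys the file on the host (and port) only.

```bash
# Hypothetical helper (not part of the browse CLI): derive a memory file key from a URL.
memory_key() (
  host="${1#*://}"     # drop the scheme, if present
  host="${host%%/*}"   # drop the path, keeping host[:port]
  printf '%s\n' "$host" | sed 's/[.:]/-/g'   # dots and colons become dashes
)

memory_key "https://news.ycombinator.com"   # news-ycombinator-com
memory_key "http://localhost:3000/admin"    # localhost-3000
```

The result could then be used as `${CLAUDE_SKILL_DIR}/memory/$(memory_key "$url").md`.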

**If site memory exists**: You have selectors from previous visits. After `browse open`, use them IMMEDIATELY. Do NOT run `browse snapshot`. Example:

```bash
# Memory says story titles use ".titleline > a" — use it directly:
browse open https://news.ycombinator.com
browse click ".titleline > a" # ← use cached selector, NO snapshot

Memory feature uses CSS selectors with ref-only click command

High Severity

The memory feature's core optimization, caching CSS selectors and using them directly to skip snapshots, is built on the assumption that `browse click` accepts CSS selectors. However, per REFERENCE.md and every example in EXAMPLES.md, `browse click` only accepts snapshot refs (e.g., `@0-5`), not CSS selectors. The example `browse click ".titleline > a"` will fail. The memory system instructs recording "stable selectors" like `input[name="email"]` and `[data-testid="..."]`, but these cannot be used with the primary interaction command.


Reviewed by Cursor Bugbot for commit efee8bf.

```

Only run `browse snapshot` if a cached selector FAILS (returns an error). Trust the memory.

**If no memory exists**: After `browse open`, use `browse snapshot` to discover the page.
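
The "try the cached selector, snapshot only on failure" rule can be sketched as a small wrapper. This is a hedged illustration: `fill_cached` is a hypothetical name, and it assumes `browse fill` exits with a non-zero status when the selector does not match, which the skill text does not confirm. It uses `browse fill` because the command list documents that command as taking a `<selector>`, whereas `browse click` is documented as taking a snapshot ref.

```bash
# Hypothetical wrapper: try a cached selector first, snapshot only if it fails.
# Assumption: `browse fill` returns a non-zero exit status on a selector miss.
fill_cached() {
  if browse fill "$1" "$2"; then
    return 0
  fi
  echo "cached selector failed: $1" >&2
  browse snapshot   # rediscover the page so the memory file can be updated
  return 1
}
```

A call might look like `fill_cached 'input[name="email"]' 'user@example.com'`; a non-zero return signals that the memory entry for that element needs updating.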

## Step 2 — Browse

### Environment Selection (Local vs Remote)

The CLI supports explicit per-session environment overrides. If you do nothing, the next session defaults to Browserbase when `BROWSERBASE_API_KEY` is set and to local otherwise.

### Local mode
#### Local mode
- `browse env local` starts a clean isolated local browser
- `browse env local --auto-connect` reuses an already-running debuggable Chrome and falls back to isolated if nothing is available
- `browse env local <port|url>` attaches to a specific CDP target
- Best for: development, localhost, trusted sites, and reproducible runs

### Remote mode (Browserbase)
#### Remote mode (Browserbase)
- `browse env remote` switches the current session to Browserbase
- Without a local override, Browserbase is also the default when `BROWSERBASE_API_KEY` is set
- Provides: anti-bot stealth, automatic CAPTCHA solving, residential proxies, session persistence
- **Use remote mode when:** the target site has bot detection, CAPTCHAs, IP rate limiting, Cloudflare protection, or requires geo-specific access
- Get credentials at https://browserbase.com/settings

### When to choose which
#### When to choose which
- **Repeatable local testing / clean state**: `browse env local`
- **Reuse your local login/cookies**: `browse env local --auto-connect`
- **Simple browsing** (docs, wikis, public APIs): local mode is fine
- **Protected sites** (login walls, CAPTCHAs, anti-scraping): use remote mode
- **If local mode fails** with bot detection or access denied: switch to remote mode

## Commands
### Commands

All commands work identically in both modes. The daemon auto-starts on first command.

### Navigation
#### Navigation
```bash
browse open <url> # Go to URL (aliases: goto)
browse open <url> --context-id <id> # Load Browserbase context (remote only)
@@ -66,7 +85,7 @@ browse back # Go back in history
browse forward # Go forward in history
```

### Page state (prefer snapshot over screenshot)
#### Page state (prefer snapshot over screenshot)
```bash
browse snapshot # Get accessibility tree with element refs (fast, structured)
browse screenshot [path] # Take visual screenshot (slow, uses vision tokens)
@@ -79,7 +98,7 @@ browse get value <selector> # Get form field value

Use `browse snapshot` as your default for understanding page state — it returns the accessibility tree with element refs you can use to interact. Only use `browse screenshot` when you need visual context (layout, images, debugging).

### Interaction
#### Interaction
```bash
browse click <ref> # Click element by ref from snapshot (e.g., @0-5)
browse type <text> # Type text into focused element
@@ -94,7 +113,7 @@ browse is checked <selector> # Check if element is checked
browse wait <type> [arg] # Wait for: load, selector, timeout
```

### Session management
#### Session management
```bash
browse stop # Stop the browser daemon (also clears env override)
browse status # Check daemon status (includes env)
@@ -108,23 +127,66 @@ browse tab_switch <index> # Switch to tab by index
browse tab_close [index] # Close tab
```

### Typical workflow
## Step 3 — AFTER completing the task: Save Site Memory

You MUST do this after every browsing task. Do NOT skip it.

Use the Write tool to create or update `${CLAUDE_SKILL_DIR}/memory/<domain>.md`:

```markdown
# <domain>

Last updated: <YYYY-MM-DD>

## <page-path> — <short description>

### Elements
- **<Element Name>**: `<selector>` — <what it does>

### Patterns
- <Multi-step flow that works>

### Notes
- <Gotchas, async loading, timing>
```

Rules for memory files:
- Record **stable selectors**: `input[name="email"]`, `[data-testid="..."]`, `button[type="submit"]` — NOT just snapshot refs like `@0-5`
- Use **URL patterns**: `/users/:id` not `/users/123` when pages share structure
- Note **async behavior**: "table loads after ~2s", "button disabled until form valid"
- Be **generous**: record all interactive elements, not just the ones you used
- If a cached selector failed, **update it** with the working one you found
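
The URL-pattern rule above could be approximated mechanically. The sketch below is an assumption-laden illustration, not part of the skill: `path_pattern` is a hypothetical helper, and it treats only purely numeric path segments as IDs.

```bash
# Hypothetical helper: replace purely numeric path segments with :id.
path_pattern() {
  printf '%s\n' "$1" | sed -E 's#/[0-9]+(/|$)#/:id\1#g'
}

path_pattern "/users/123"             # /users/:id
path_pattern "/users/123/posts/456"   # /users/:id/posts/:id
```

Two back-to-back numeric segments (for example `/1/2`) are only partly rewritten by this expression, and non-numeric IDs such as slugs or UUIDs are left untouched, so the output is a starting point to review by hand.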

## Typical Workflow (all 3 steps)

If the environment matters, set it first with `browse env local`, `browse env local --auto-connect`, or `browse env remote`.

1. `browse open <url>` — navigate to the page
2. `browse snapshot` — read the accessibility tree to understand page structure and get element refs
3. `browse click <ref>` / `browse type <text>` / `browse fill <selector> <value>` — interact using refs from snapshot
4. `browse snapshot` — confirm the action worked
5. Repeat 3-4 as needed
6. `browse stop` — close the browser when done
1. **Read site memory** (MANDATORY): `cat ${CLAUDE_SKILL_DIR}/memory/<domain>.md`
2. `browse open <url>` — navigate to the page
3. If memory had selectors, use them directly. Otherwise: `browse snapshot`
4. Interact: `browse click` / `browse type` / `browse fill`
5. `browse snapshot` to confirm (if needed)
6. Repeat 3-5
7. **Write site memory** (MANDATORY): update `${CLAUDE_SKILL_DIR}/memory/<domain>.md`
8. `browse stop` — close browser when done

## Quick Example

```bash
# STEP 1: Setup + memory check
which browse || npm install -g @browserbasehq/browse-cli
mkdir -p ${CLAUDE_SKILL_DIR}/memory
cat ${CLAUDE_SKILL_DIR}/memory/example-com.md 2>/dev/null || echo "NO MEMORY"

# STEP 2: Browse
browse open https://example.com
browse snapshot # see page structure + element refs
browse click @0-5 # click element with ref 0-5
browse snapshot # only if no memory, or cached selectors failed
browse click @0-5
browse get title

# STEP 3: Save memory (use Write tool)
# Write ${CLAUDE_SKILL_DIR}/memory/example-com.md with elements + patterns

browse stop
```

@@ -143,19 +205,21 @@ browse stop

## Best Practices

1. **Choose the local strategy deliberately**: use `browse env local` for clean state, `browse env local --auto-connect` for existing local credentials, and `browse env remote` for protected sites
2. **Always `browse open` first** before interacting
3. **Use `browse snapshot`** to check page state — it's fast and gives you element refs
4. **Only screenshot when visual context is needed** (layout checks, images, debugging)
5. **Use refs from snapshot** to click/interact — e.g., `browse click @0-5`
6. **`browse stop`** when done to clean up the browser session and clear the env override
1. **ALWAYS read site memory before browsing** — this is not optional
2. **ALWAYS write site memory after browsing** — this is not optional
3. **Choose the local strategy deliberately**: `browse env local` for clean state, `--auto-connect` for existing credentials, `remote` for protected sites
4. **Use `browse snapshot`** only when no memory exists or cached selectors fail
5. **Only screenshot when visual context is needed** (layout checks, images, debugging)
6. **Use refs from snapshot** to click/interact — e.g., `browse click @0-5`
7. **`browse stop`** when done to clean up the browser session and clear the env override

## Troubleshooting

- **"No active page"**: Run `browse stop`, then check `browse status`. If it still says running, kill the zombie daemon with `pkill -f "browse.*daemon"`, then retry `browse open`
- **Chrome not found**: Install Chrome, use `browse env local --auto-connect` if you already have a debuggable Chrome running, or switch to `browse env remote`
- **Action fails**: Run `browse snapshot` to see available elements and their refs
- **Browserbase fails**: Verify API key is set
- **Cached selector fails**: Take a fresh `browse snapshot`, find the updated selector, update the memory file

## Switching to Remote Mode
