browserbase · shubh24 · Apr 7, 2026 · Apr 7, 2026 · Apr 7, 2026 · Apr 7, 2026
diff --git a/skills/autobrowse/.env.example b/skills/autobrowse/.env.example
@@ -0,0 +1,3 @@
+ANTHROPIC_API_KEY=sk-ant-...
+BROWSERBASE_API_KEY=bb_live_...
+BROWSERBASE_PROJECT_ID=your-project-id
diff --git a/skills/autobrowse/.gitignore b/skills/autobrowse/.gitignore
@@ -0,0 +1,6 @@
+node_modules/
+.env
+tasks/
+traces/
+*.log
+.DS_Store
diff --git a/skills/autobrowse/EXAMPLES.md b/skills/autobrowse/EXAMPLES.md
@@ -0,0 +1,91 @@
+# AutoBrowse Examples
+
+## Single task — interactive loop
+
+```
+/autobrowse --task my-portal
+```
+
+Run the evaluate → read trace → improve strategy cycle for one task. Claude iterates until you stop it or the task graduates.
+
+---
+
+## Single task — remote mode
+
+For sites with bot detection (login walls, CAPTCHAs, Cloudflare):
+
+```
+/autobrowse --task schwab-login --env remote
+```
+
+---
+
+## Fixed iterations
+
+Run exactly 10 cycles then stop:
+
+```
+/autobrowse --task my-portal --iterations 10
+```
+
+---
+
+## Multiple tasks in parallel
+
+Run all tasks in your `tasks/` directory with 5 iterations each:
+
+```
+/autobrowse --all --iterations 5 --env remote
+```
+
+Or specify tasks explicitly:
+
+```
+/autobrowse --tasks payment-portal,receipt-download,account-summary --iterations 5
+```
+
+---
+
+## Nightly skill refresh
+
+Keep skills fresh as websites change. Add to your crontab:
+
+```bash
+0 1 * * * cd /path/to/project && claude "/autobrowse --all --iterations 3 --env remote" >> autobrowse.log 2>&1
+```
+
+---
+
+## Evaluate only (no improvement)
+
+Run the inner agent once and read the trace yourself:
+
+```bash
+node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task my-portal --env remote
+cat traces/my-portal/latest/summary.md
+```
+
+---
+
+## What a graduated skill looks like
+
+After several iterations, `tasks/my-portal/skill.md` will contain hard-won site knowledge:
+
+```markdown
+## Fast Path
+Navigate directly to the form — skip the landing page:
+https://portal.example.com/pay?invoice=true
+
+## Form Fields
+- Invoice number: `#wpforms-68-field_3`
+- Card number: `#wpforms-68-field_6`
+
+## Timing Rules
+- After clicking Submit: wait 3s for spinner before checking result
+- Dropdown populates 500ms after focus — don't rush
+
+## Failure Recovery
+- "Invalid credentials" = bot detection, retry with --env remote
+```
+
+This file drops straight into your stagehand/browser-use agent as a system prompt addition.
diff --git a/skills/autobrowse/LICENSE.txt b/skills/autobrowse/LICENSE.txt
@@ -0,0 +1,21 @@
+MIT License
+
+Copyright (c) 2025 Browserbase, Inc.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.
diff --git a/skills/autobrowse/README.md b/skills/autobrowse/README.md
@@ -0,0 +1,86 @@
+# AutoBrowse
+
+Self-improving browser automation via the auto-research loop. Build reliable, production-ready navigation skills for any website — overnight, autonomously.
+
+## How it works
+
+An **inner agent** browses your target site and attempts the task. An **outer agent** (you, via `/autobrowse`) reads what went wrong and improves the instructions. Repeat until it passes consistently.
+
+The output is a `skill.md` — a site-specific playbook any agent can follow. Once mature, it replaces expensive LLM exploration with deterministic, cached navigation. Typical cost reduction: **80%+**.
+
+## Requirements
+
+- Node.js 18+
+- [Claude Code](https://claude.ai/code)
+- `browse` CLI: `npm install -g @browserbasehq/browse-cli`
+- `ANTHROPIC_API_KEY` in your environment
+- For bot-protected sites: `BROWSERBASE_API_KEY` + `BROWSERBASE_PROJECT_ID`
+
+## Setup
+
+```bash
+git clone <this-repo>
+cd autobrowse
+npm install
+cp .env.example .env   # fill in your API keys
+```
+
+## Your project structure
+
+Create this in your working directory before running `/autobrowse`:
+
+```
+your-project/
+├── tasks/
+│   └── my-portal/
+│       ├── task.md        ← describe what the agent should do
+│       └── strategy.md    ← auto-created and improved each iteration
+└── traces/                ← auto-created at runtime, add to .gitignore
+```
+
+See `references/example-task.md` for the `task.md` format.
+
+## Usage
+
+Open Claude Code in your project directory and run:
+
+```
+/autobrowse --task my-portal
+```
+
+The skill runs the inner agent, reads the trace, improves `strategy.md`, and repeats. When the task passes consistently, a `skill.md` is written alongside `strategy.md` — that's your shippable output.
+
+For multiple tasks in parallel:
+
+```
+/autobrowse --all --iterations 5 --env remote
+```
+
+## Graduated skills
+
+When a task's `skill.md` is ready, copy it into any agent's system prompt. It gives the agent precise, site-specific instructions — no more blind exploration on every run.
+
+See `references/example-skill.md` for the format of a finished skill.
+
+## Environment modes
+
+| | Local | Remote (Browserbase) |
+|-|-------|----------------------|
+| Setup | Chrome required | API key required |
+| Stealth / CAPTCHA | No | Yes |
+| Parallelism | 1 task at a time | Up to 20+ |
+
+Use `--env remote` for sites with bot detection or when running multiple tasks simultaneously.
+
+## Architecture
+
+Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch) — the same loop that optimizes ML experiments, applied to browser automation.
+
+```
+outer agent (Claude Code + /autobrowse skill)
+  └── reads trace → improves strategy.md → repeats
+
+inner agent (scripts/evaluate.mjs → Anthropic API)
+  └── browse open → snapshot → click → snapshot → ...
+  └── writes traces/ with summary, full trace, screenshots
+```
diff --git a/skills/autobrowse/REFERENCE.md b/skills/autobrowse/REFERENCE.md
@@ -0,0 +1,53 @@
+# AutoBrowse Reference
+
+## evaluate.mjs flags
+
+```bash
+node ${CLAUDE_SKILL_DIR}/scripts/evaluate.mjs --task <name> [options]
+```
+
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--task <name>` | required | Task name — matches `tasks/<name>/` directory |
+| `--env local\|remote` | `local` | Browser environment |
+| `--model <model>` | `claude-sonnet-4-6` | Claude model for the inner agent |
+| `--run-number N` | auto-increment | Force a specific run number |
+
+## Environment variables
+
+| Variable | Required | Description |
+|----------|----------|-------------|
+| `ANTHROPIC_API_KEY` | Yes | Claude API key |
+| `BROWSERBASE_API_KEY` | Remote only | Browserbase API key |
+| `BROWSERBASE_PROJECT_ID` | Remote only | Browserbase project ID |
+
+## Trace artifacts
+
+Each run writes to `traces/<task>/run-NNN/`:
+
+| File | Description |
+|------|-------------|
+| `summary.md` | Duration, cost, turn-by-turn decision log, final output |
+| `trace.json` | Full tool call log — every command and response |
+| `messages.json` | Raw Anthropic API message history |
+| `screenshots/` | Visual captures saved during the run |
+
+`traces/<task>/latest` is a symlink to the most recent run.
+
+## Models
+
+| Model | Cost | Best for |
+|-------|------|----------|
+| `claude-sonnet-4-6` | $$ | Default — good balance of speed and accuracy |
+| `claude-opus-4-6` | $$$$ | Hardest tasks, complex multi-step workflows |
+| `claude-haiku-4-5-20251001` | $ | Simple tasks, high-volume iteration |
+
+## Skill lifecycle
+
+```
+task.md        → input (you write this, don't edit after)
+strategy.md    → working file (auto-improved each iteration)
+skill.md       → output (graduated from strategy.md when ready to ship)
+```
+
+A task is ready to graduate when it passes on 2+ of the last 3 consecutive runs.