abczsl520/browser-use-skill

🌐 Browser-Use Skill for OpenClaw

Stop fighting with snapshot→act loops. Let AI handle complex browser automation end-to-end.

😤 The Problem

OpenClaw's built-in browser tool is great for simple tasks: screenshot, click a button, done. But for multi-step workflows, it becomes a nightmare:

Agent: *takes snapshot* → *clicks wrong button* → *takes snapshot* → *page changed* → 
       *confused* → *clicks again* → *popup appeared* → *lost* → ❌ Failed

Sound familiar? Login forms, dynamic pages, popups, anti-bot detection: the built-in tool wasn't designed for these.

✅ The Solution

This skill integrates Browser-Use (38k+ ⭐), an AI browser agent that sees pages like a human and completes entire workflows autonomously.

You:          "Log into Reddit and post this article to r/python"
Browser-Use:  ✅ Opens login → types credentials → handles CAPTCHA wait → 
              navigates to submit → fills title & body → clicks Post → returns URL

One task in, result out. No manual step-by-step babysitting.

⚡ Quick Start

# 1. Install the skill
clawhub install browser-use

# 2. Setup Python environment (one-time)
python3 -m venv ~/browser-use-env
source ~/browser-use-env/bin/activate
pip install browser-use playwright langchain-openai
playwright install chromium

Then just tell your OpenClaw agent:

"Use browser-use to log into Reddit and publish a post"

The skill handles everything: mode selection, script generation, execution, and error recovery.
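
Before handing a task to the agent, it can help to confirm that the one-time Python setup above actually worked. The following is a minimal sketch, assuming it is run inside the activated `~/browser-use-env` environment; the helper name `missing_modules` is hypothetical and not part of the skill.

```python
import importlib.util

# Import names (not pip names) of the packages installed in the Quick Start step.
REQUIRED_MODULES = ["browser_use", "playwright", "langchain_openai"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

This only checks that the packages are importable; it does not verify that `playwright install chromium` downloaded a browser.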

🤔 When to Use What

| Scenario | Built-in browser | This Skill |
| --- | --- | --- |
| Take a screenshot | ✅ Free & instant | ❌ Overkill |
| Click one button | ✅ | ❌ |
| 5+ step workflow (login→navigate→fill→submit) | ❌ Breaks easily | ✅ Autonomous |
| Anti-bot sites (Reddit, LinkedIn, Twitter) | ❌ Detected | ✅ Real Chrome |
| Batch operations | ❌ | ✅ |
| Data scraping with complex navigation | ❌ Manual | ✅ Smart |

Rule of thumb: If it takes more than 3 clicks, use Browser-Use.

🔑 Key Features

🎯 Smart Task Routing

The skill knows when Browser-Use is needed vs when the built-in tool is enough. No wasted API calls.
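
The README does not show how the skill makes this decision, so the following is only an illustrative sketch of the rule of thumb and the "When to Use What" table. `TaskProfile` and `choose_tool` are hypothetical names, and the real routing logic may weigh other signals.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    estimated_clicks: int          # rough number of interactions needed
    anti_bot_site: bool = False    # e.g. Reddit, LinkedIn, Twitter
    batch: bool = False            # same workflow repeated over many items


def choose_tool(task: TaskProfile) -> str:
    """Return "browser-use" or "built-in" following the README's rule of thumb."""
    if task.anti_bot_site or task.batch:
        return "browser-use"
    if task.estimated_clicks > 3:  # "more than 3 clicks"
        return "browser-use"
    return "built-in"


print(choose_tool(TaskProfile(estimated_clicks=1)))                      # single click
print(choose_tool(TaskProfile(estimated_clicks=6, anti_bot_site=True)))  # login workflow
```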

🔐 Secure Credential Handling

Passwords use placeholder substitution, so the LLM never sees your real credentials:

agent = Agent(
    task="Login with x_user and x_pass",
    sensitive_data={"x_user": "real@email.com", "x_pass": "S3cret!"},
)
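
To make the idea of placeholder substitution concrete, here is a conceptual sketch in plain Python. It is not Browser-Use's actual implementation, and `mask_secrets` and `fill_placeholders` are hypothetical names. It only illustrates the two directions: real values are swapped in just before text is typed into the page, and swapped back out before page text is shown to the model.

```python
def mask_secrets(text: str, sensitive_data: dict[str, str]) -> str:
    """Replace real values with their placeholder names before text reaches the LLM."""
    # Replace longer secrets first so a secret contained in another is not half-masked.
    for placeholder, secret in sorted(sensitive_data.items(), key=lambda kv: -len(kv[1])):
        if secret:
            text = text.replace(secret, placeholder)
    return text


def fill_placeholders(text: str, sensitive_data: dict[str, str]) -> str:
    """Swap placeholder names for real values just before typing into the page."""
    for placeholder, secret in sensitive_data.items():
        text = text.replace(placeholder, secret)
    return text


secrets = {"x_user": "real@email.com", "x_pass": "S3cret!"}
typed = fill_placeholders("x_user", secrets)                      # what the browser types
seen = mask_secrets("Signed in as real@email.com", secrets)       # what the model reads
```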

🛡️ Anti-Detection (Real Chrome Mode)

Connect to your actual Chrome browser via CDP (Chrome DevTools Protocol). Sites see your real browser and profile rather than a fresh automated instance, which makes detection far less likely:

browser = Browser(cdp_url="http://127.0.0.1:9222")
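
Connecting over CDP only works if Chrome is already running with remote debugging enabled (for example, started with `--remote-debugging-port=9222`). A minimal sketch for checking this first, using Chrome's `/json/version` DevTools endpoint; the helper names `cdp_version_url` and `probe_cdp` are hypothetical and not part of the skill or of Browser-Use.

```python
import http.client
import json
import urllib.request


def cdp_version_url(cdp_url: str) -> str:
    """Build the URL of Chrome's DevTools version endpoint from a base CDP URL."""
    return cdp_url.rstrip("/") + "/json/version"


def probe_cdp(cdp_url: str, timeout: float = 2.0):
    """Return Chrome's version info as a dict, or None if nothing usable answers."""
    try:
        with urllib.request.urlopen(cdp_version_url(cdp_url), timeout=timeout) as resp:
            data = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        return None
    return data if isinstance(data, dict) else None


if __name__ == "__main__":
    info = probe_cdp("http://127.0.0.1:9222")
    print(info.get("Browser") if info else "No Chrome debugging endpoint found on port 9222.")
```

If the probe returns `None`, creating `Browser(cdp_url=...)` is unlikely to succeed, so it is cheaper to fail here than inside an agent run.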

⚡ Flash Mode

Skip LLM reasoning for simple steps, 2x faster:

agent = Agent(task="...", flash_mode=True)

🔧 Built-in Failure Recovery

CAPTCHA? Timeout? Anti-spam? The skill includes a complete decision tree for common failures.
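
The skill's actual decision tree is not reproduced in this README. As a rough illustration of the idea only, the sketch below classifies a failure by keyword and retries the retryable ones with exponential backoff. The categories, keywords, actions, and function names are all assumptions made for this example.

```python
import asyncio

# Hypothetical failure categories and responses; the skill's real tree may differ.
ACTIONS = {
    "captcha": "pause and ask the user to solve it",
    "timeout": "retry with backoff",
    "anti_spam": "wait, then retry",
    "unknown": "stop and report the error",
}


def classify_failure(message: str) -> str:
    """Map an error message to a coarse failure category using keyword matching."""
    text = message.lower()
    if "captcha" in text:
        return "captcha"
    if "timeout" in text or "timed out" in text:
        return "timeout"
    if "rate limit" in text or "too many requests" in text or "spam" in text:
        return "anti_spam"
    return "unknown"


async def run_with_recovery(run_once, max_attempts: int = 3, base_delay: float = 1.0):
    """Await the async callable `run_once`, retrying only on retryable categories."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await run_once()
        except Exception as exc:
            kind = classify_failure(str(exc))
            if kind not in ("timeout", "anti_spam") or attempt == max_attempts:
                raise  # CAPTCHAs and unknown errors need a human, not a retry
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

With a Browser-Use agent, `run_once` could be something like `agent.run`, though whether an agent instance can safely be rerun after a failure is not confirmed here.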

📖 Quick Example

import asyncio
from browser_use import Agent, ChatOpenAI, Browser

async def main():
    llm = ChatOpenAI(model="gpt-4o-mini", api_key="YOUR_KEY")
    browser = Browser(cdp_url="http://127.0.0.1:9222")  # Real Chrome
    
    agent = Agent(
        task="""
        1. Go to https://news.ycombinator.com
        2. Extract the top 5 story titles and URLs
        3. Return them as a formatted list
        """,
        llm=llm, browser=browser, use_vision=True, max_steps=15,
    )
    history = await agent.run()  # run history, including every step taken
    print(history.final_result())  # just the agent's final answer

asyncio.run(main())
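
The example above hardcodes `api_key="YOUR_KEY"` for brevity. A small sketch of reading the key from the environment instead, so it never lands in a script or in version control; the helper `require_env` is hypothetical, and `OPENAI_API_KEY` is assumed here only because it is the conventional variable name.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise a clear error."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage inside main(), replacing the hardcoded key:
#     llm = ChatOpenAI(model="gpt-4o-mini", api_key=require_env("OPENAI_API_KEY"))
```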

🎮 LLM Compatibility

| LLM | Works | Best For |
| --- | --- | --- |
| GPT-4o-mini | ✅ | Default choice: fast & cheap |
| GPT-4o | ✅ | Complex reasoning tasks |
| Claude 3.5+ | ✅ | Good alternative |
| Gemini | ❌ | Structured output incompatible |

📚 Documentation

📖 Full Wiki →

| Guide | What You'll Learn |
| --- | --- |
| Getting Started | Install, setup, first automation |
| Mode A vs Mode B | Built-in Chromium vs Real Chrome |
| Task Writing Guide | Write prompts that work first try |
| Sensitive Data | Secure password handling + 2FA |
| Real-World Examples | Copy-paste recipes |
| Troubleshooting | Fix common issues |
| FAQ | Quick answers |

🔗 Related

🤝 Contributing

Found a bug? Have an idea? Open an issue or submit a PR!

📄 License

MIT. Use it however you want.


⭐ If this skill saved you time, consider starring the repo. It helps others find it!

About

🌐 OpenClaw skill for Browser-Use: AI-powered browser automation for complex multi-step workflows (login, form filling, scraping, posting)
