Browser automation for AI agents and developers. Search the web, test browser-based applications, and record automation scripts — all from your editor.
WebCure is a hybrid VS Code extension that combines two separate browser automation approaches into one package:
- Language Model Tools: 28 tools registered with VS Code's `vscode.lm.registerTool()` API so GitHub Copilot can control the browser directly in chat.
- File Bridge + CLI: a file-based protocol (`.webcure/input.json` → `output.json`) with a CLI wrapper so AI agents in Cursor, Antigravity, or any terminal-capable IDE can control the same browser engine via shell commands.
Both approaches share the same Playwright-based browser engine. Whether Copilot invokes explorer_navigate or a Cursor agent runs node .webcure/cli.js navigate url=..., the same underlying code executes.
- Architecture Overview
- Prerequisites
- Installation (Step by Step)
- Quick Start: VS Code Copilot (Language Model Tools)
- Quick Start: Cursor / AI Agents (File Bridge + CLI)
- Language Model Tools Reference
- File Bridge Commands Reference
- How LM Tools and Bridge Commands Relate
- VS Code Commands (Command Palette)
- HTTP API Server
- Browser Session Recording
- API Script Recording & Python Playback (Legacy)
- JSON Script Runner
- Configuration
- Development
- Testing
- Troubleshooting
┌─────────────────────────────────────────────────────────────────────┐
│                          WebCure Extension                          │
│                                                                     │
│   ┌──────────────────────┐               ┌──────────────────────┐   │
│   │ Language Model Tools │               │ File Bridge + CLI    │   │
│   │ (VS Code Copilot)    │               │ (Cursor / Agents)    │   │
│   │                      │               │                      │   │
│   │ 28 explorer_* tools  │               │ .webcure/input.json  │   │
│   │ registered via       │               │ .webcure/output.json │   │
│   │ vscode.lm API        │               │ .webcure/cli.js      │   │
│   └──────────┬───────────┘               └──────────┬───────────┘   │
│              │                                      │               │
│              ▼                                      ▼               │
│          ┌───────────────────────────────────────────────┐          │
│          │       Shared Tool Instances (28 tools)        │          │
│          └───────────────────────┬───────────────────────┘          │
│                                  │                                  │
│                                  ▼                                  │
│          ┌───────────────────────────────────────────────┐          │
│          │       BrowserManager (Playwright-core)        │          │
│          │          Uses system Chrome or Edge           │          │
│          └───────────────────────────────────────────────┘          │
│                                                                     │
│  ┌─────────────────┐  ┌─────────────────────┐  ┌─────────────────┐  │
│  │ HTTP API Server │  │ API Script Recorder │  │ Browser Session │  │
│  │ (port 5678)     │  │ (Legacy)            │  │ Recorder        │  │
│  └─────────────────┘  └─────────────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────────────────┘
Key points:
- The 28 Language Model Tool classes are the core engine. They implement `vscode.LanguageModelTool`.
- The file bridge handles 61 command names: 38 map to the 28 tool instances, plus 23 bridge-only commands for scrolling, recording, etc.
- Commands like `scrollDown`, `doubleClick`, `rightClick`, `launchBrowser`, `getPageText`, and `highlight` exist only in the file bridge. They have no LM tool equivalent because they are simple Playwright calls that don't need the full tool infrastructure.
- The HTTP API server provides a third access path to the same tools via `POST /invoke`.
- Node.js 18+: check with `node --version`
- npm: comes with Node.js
- Google Chrome or Microsoft Edge: WebCure uses `playwright-core`, which connects to your system browser (it does not download Chromium automatically)
- VS Code 1.95+: for Language Model Tools support (requires Copilot)
cd ~/Developer
git clone https://github.com/naveedulislam/webcure.git webcure

Or if you already have the source:

cd ~/Developer/webcure

npm install

npm run compile

This runs tsc and outputs JavaScript to the out/ directory.

npm run package

This produces webcure-<version>.vsix (e.g. webcure-1.0.0.vsix) in the project root. The version comes from package.json.
VS Code (command line):

code --install-extension webcure-1.0.0.vsix

VS Code (graphical):
- Open VS Code
- Press `Cmd+Shift+P` (macOS) or `Ctrl+Shift+P` (Windows/Linux)
- Type Extensions: Install from VSIX...
- Navigate to `~/Developer/webcure/`
- Select `webcure-1.0.0.vsix`
- Restart VS Code when prompted

Cursor (command line):

cursor --install-extension webcure-1.0.0.vsix

Cursor (graphical):
- Open Cursor
- Press `Cmd+Shift+P`
- Type Extensions: Install from VSIX...
- Select `webcure-1.0.0.vsix`
- Restart Cursor

- Open the Command Palette: `Cmd+Shift+P`
- Type WebCure
- You should see all WebCure commands listed (Navigate to URL, Click Element, etc.)
Once the extension is installed and VS Code has restarted, the 28 explorer_* tools are automatically registered with VS Code's Language Model API. GitHub Copilot can use them directly in chat.
Example conversation:
You: Navigate to https://example.com and take a screenshot
Copilot: (uses explorer_navigate then explorer_take_screenshot automatically)
Done. Screenshot saved to screenshot.png
You: Find the login form data, fill in the email and password fields
Copilot: (uses explorer_snapshot, explorer_fill_form)
You can also reference tools explicitly by typing # in chat:
#explorer_navigate to https://news.ycombinator.com
#explorer_snapshot to see the page structure
#explorer_click on "new" link
No configuration required. The tools are available as soon as the extension activates (onStartupFinished).
When the extension activates, it creates a .webcure/ directory in your workspace root containing:
- `cli.js`: CLI helper that writes `input.json` and polls for `output.json`
- `input.json`: written by the agent, read by the extension
- `output.json`: written by the extension, read by the agent

- Agent runs: `node .webcure/cli.js navigate url=https://example.com`
- `cli.js` writes `{"command": "navigate", "args": {"url": "https://example.com"}}` to `.webcure/input.json`
- The extension detects the file via `fs.watch`, executes the command, writes the result to `.webcure/output.json`, and deletes `input.json`
- `cli.js` polls for `output.json`, reads it, prints the result, and deletes it
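
The round trip described above can be sketched from the agent's side in Python. This is a minimal illustration, not the shipped `cli.js`: the helper names (`write_json_atomically`, `send_command`, `fake_extension`), the temporary directory, and the reply shape (`{"ok": ..., "echo": ...}`) are assumptions made for the example. Only the `input.json` / `output.json` handshake comes from the description.

```python
import json
import tempfile
import threading
import time
from pathlib import Path


def write_json_atomically(path: Path, payload: dict) -> None:
    """Write to a sibling temp file, then rename, so readers never see partial JSON."""
    temp_path = path.with_suffix(".tmp")
    temp_path.write_text(json.dumps(payload))
    temp_path.replace(path)


def send_command(bridge_dir: Path, command: str, args: dict, timeout: float = 5.0) -> dict:
    """Agent side: write input.json, then poll for output.json and consume it."""
    write_json_atomically(bridge_dir / "input.json", {"command": command, "args": args})
    output_path = bridge_dir / "output.json"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if output_path.exists():
            result = json.loads(output_path.read_text())
            output_path.unlink()  # the reader deletes output.json after reading it
            return result
        time.sleep(0.05)
    raise TimeoutError(f"no output.json appeared within {timeout} seconds")


def fake_extension(bridge_dir: Path) -> None:
    """Hypothetical stand-in for the extension: consume input.json, write output.json."""
    input_path = bridge_dir / "input.json"
    while not input_path.exists():
        time.sleep(0.01)
    request = json.loads(input_path.read_text())
    input_path.unlink()  # the extension deletes input.json once it has the command
    write_json_atomically(bridge_dir / "output.json", {"ok": True, "echo": request})


with tempfile.TemporaryDirectory() as tmp:
    bridge = Path(tmp)
    worker = threading.Thread(target=fake_extension, args=(bridge,))
    worker.start()
    reply = send_command(bridge, "navigate", {"url": "https://example.com"})
    worker.join()
```

Writing through a temporary file and renaming is a design choice of this sketch: it avoids the poller reading a half-written file. Whether the real bridge does the same is not stated in this document.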
# Launch a browser and navigate
node .webcure/cli.js launchBrowser url=https://example.com
# Click by visible text
node .webcure/cli.js click target="Sign In"
# Click with spatial targeting (click "Edit" below "Profile")
node .webcure/cli.js click target="Edit" below="Profile"
# Type into a field
node .webcure/cli.js typeText text="hello" into="Search"
# Take an accessibility snapshot (assigns refs e1, e2, ...)
node .webcure/cli.js snapshot
# Find an element (returns a ref for later use)
node .webcure/cli.js find text="Submit"
# Interact using a ref from snapshot or find
node .webcure/cli.js interact action=click ref=e3
# Fill a form using refs
node .webcure/cli.js fillForm 'fields=[{"name":"Email","type":"textbox","ref":"e3","value":"test@example.com"}]'
# Scroll the page
node .webcure/cli.js scrollDown pixels=500
# Take a screenshot
node .webcure/cli.js screenshot filename=result.png
# Get page text content
node .webcure/cli.js getPageText
# Scrape menu structure
node .webcure/cli.js scrapeMenu
# Scrape page structure (forms, tables)
node .webcure/cli.js scrapePage
# Close the browser
node .webcure/cli.js closeBrowser
# See all available commands
node .webcure/cli.js help

Bridge commands use the same tools as Copilot. The file bridge routes commands to the same 28 tool instances that Copilot uses. When you run node .webcure/cli.js navigate url=..., the bridge internally calls NavigateTool.invoke(), the exact same code path as when Copilot uses explorer_navigate. The bridge command names are just friendlier aliases (e.g., navigate instead of explorer_navigate, click instead of explorer_click).

These 28 tools are registered with VS Code's vscode.lm.registerTool() API. Copilot invokes them automatically based on user requests.
| Tool Name | Display Name | Description |
|---|---|---|
| `explorer_navigate` | Navigate to URL | Open a URL in the browser |
| `explorer_resize` | Resize Window | Resize viewport (preset fullscreen or custom width/height) |
| `explorer_extract` | Extract Content | Extract visible text from the page or a CSS selector |
| `explorer_click` | Click Element | Click by ref, text, or selector with spatial targeting |
| `explorer_hover` | Hover Element | Hover over an element |
| `explorer_type` | Type Text | Type into an input field |
| `explorer_type_from_file` | Type Text from File | Type large text content from a file into an input |
| `explorer_wait_for` | Wait For | Wait for text to appear/disappear or a fixed time |
| `explorer_wait_for_element` | Wait For Element | Wait for an element to be visible/hidden/attached/detached |
| `explorer_select_option` | Select Option | Pick a value from a dropdown |
| `explorer_fill_form` | Fill Form | Fill multiple form fields using refs from snapshot |
| `explorer_take_screenshot` | Take Screenshot | Capture screenshot (full page or specific element) |
| `explorer_close` | Close Browser | Close the page and release resources |
| `explorer_console_messages` | Get Console Messages | Return recent browser console messages |
| `explorer_drag` | Drag And Drop | Drag from one element to another using refs |
| `explorer_evaluate` | Evaluate Script | Run JavaScript in the page context |
| `explorer_file_upload` | Upload File | Upload files via file input or file chooser |
| `explorer_handle_dialog` | Handle Dialog | Accept or dismiss alert/confirm/prompt dialogs |
| `explorer_navigate_back` | Navigate Back | Go back in browser history |
| `explorer_network_requests` | Get Network Requests | Return observed network requests |
| `explorer_press_key` | Press Key | Press a keyboard key (Enter, Tab, etc.) |
| `explorer_snapshot` | Accessibility Snapshot | Capture accessibility tree with element refs (e1, e2, ...) |
| `explorer_tabs` | Manage Tabs | List, create, close, or select browser tabs |
| `explorer_install` | Install Browser | No-op (uses system Chrome/Edge) |
| `explorer_find` | Find Element | Find elements by text, position, or selector; returns a ref |
| `explorer_interact` | Interact with Element | Multi-action: click, type, hover, clear, select, focus, check, uncheck |
| `explorer_scrape_menu` | Scrape Menu Structure | Extract hierarchical navigation menus as JSON |
| `explorer_scrape_page` | Scrape Page Content | Extract forms, tables, and filters as structured JSON |

The file bridge accepts all of the above tool commands (by alias) plus additional bridge-only commands:
These bridge commands invoke the same tool code as the LM tools:
| Bridge Command | Maps To LM Tool | Notes |
|---|---|---|
| `navigate` | `explorer_navigate` | `url` arg (or `target` as alias) |
| `click` | `explorer_click` | `target` arg maps to `text` |
| `hover` | `explorer_hover` | Same spatial targeting (above, below, etc.) |
| `typeText` | `explorer_type` | `into` arg maps to field identifier |
| `typeFromFile` | `explorer_type_from_file` | |
| `pressKey` | `explorer_press_key` | `key` or `target` arg |
| `selectOption` | `explorer_select_option` | `comboBox` + `value` args |
| `fillForm` | `explorer_fill_form` | |
| `screenshot` | `explorer_take_screenshot` | `filename` or `outputPath` args |
| `consoleMessages` | `explorer_console_messages` | |
| `networkRequests` | `explorer_network_requests` | |
| `handleDialog` | `explorer_handle_dialog` | |
| `uploadFile` | `explorer_file_upload` | |
| `evaluate` | `explorer_evaluate` | `expression` arg maps to `function` |
| `navigateBack` | `explorer_navigate_back` | |
| `goBack` | `explorer_navigate_back` | Alias for `navigateBack` |
| `snapshot` | `explorer_snapshot` | |
| `find` | `explorer_find` | |
| `interact` | `explorer_interact` | |
| `scrapeMenu` | `explorer_scrape_menu` | |
| `scrapePage` | `explorer_scrape_page` | |
| `drag` / `dragTo` | `explorer_drag` | `source` + `target` args |
| `close` | `explorer_close` | |
| `closeBrowser` | `explorer_close` | Alias |
| `tabs` | `explorer_tabs` | |
| `listTabs` | `explorer_tabs` (list) | |
| `newTab` | `explorer_tabs` (new) | |
| `closeTab` | `explorer_tabs` (close) | |
| `selectTab` | `explorer_tabs` (select) | |
| `waitForText` | `explorer_wait_for` | |
| `waitForElement` | `explorer_wait_for_element` | |
| `wait` | `explorer_wait_for` | `ms` or `time` arg |
| `resize` | `explorer_resize` | |
| `resizeBrowser` | `explorer_resize` | Alias |
| `fullscreenBrowser` | `explorer_resize` (fullscreen) | |
| `extract` | `explorer_extract` | |

These commands exist only in the file bridge and are implemented directly against BrowserManager / Playwright:
| Bridge Command | Description |
|---|---|
| `launchBrowser` | Open a browser window (optionally navigate to a URL) |
| `scrollDown` | Scroll down (default 500px, configurable via `pixels` arg) |
| `scrollUp` | Scroll up |
| `scrollRight` | Scroll right (default 300px) |
| `scrollLeft` | Scroll left |
| `doubleClick` | Double-click an element by text or ref |
| `rightClick` | Right-click an element by text or ref |
| `refresh` | Reload the current page |
| `goForward` | Go forward in browser history |
| `switchWindow` | Switch to a tab by title match |
| `getPageInfo` | Get current URL and page title |
| `getPageContent` | Get raw HTML content (truncated to 50KB) |
| `getPageText` | Get visible text content (truncated to 50KB) |
| `getAccessibilityTree` | Get accessibility tree (delegates to snapshot tool) |
| `highlight` | Visually outline an element on the page |
| `getDialogText` | Get text from the current dialog |
| `startRecording` | Begin recording API-based actions (legacy) |
| `stopRecording` | Stop API recording and return generated Python script |
| `startStepRecorder` | Start browser session recording (optionally with `url`) |
| `stopStepRecorder` | Stop browser session and generate output |
| `restartExtensionHost` | Restart VS Code extension host (useful after VSIX install) |
| `runScript` | Execute a JSON automation script |

Are the commands duplicated?
No. There is one set of 28 tool classes (in tools.ts). The LM tool registration and the bridge both point to the same instances:
- LM Path: Copilot → `vscode.lm.registerTool('explorer_navigate', NavigateTool)` → `NavigateTool.invoke()`
- Bridge Path: CLI → `input.json` → bridge → `BRIDGE_TO_TOOL['navigate']` → `toolInstances.navigate.invoke()`
The bridge adds convenience aliases (e.g., goBack → navigateBack, closeBrowser → close) and bridge-only commands (scrolling, page info, recording) that don't need the full LM tool interface.
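
The shared-instance idea can be illustrated with a small Python sketch. It is not the extension's TypeScript code: the stub tool classes, the `LM_TOOLS` and `BRIDGE_ALIASES` tables, and the `dispatch` function are hypothetical names chosen to mirror the description (one set of tool instances, reached through two lookup tables plus aliases).

```python
class NavigateTool:
    """Stub standing in for a real tool class; invoke() just reports what it got."""

    def invoke(self, args: dict) -> str:
        return f"navigated to {args['url']}"


class NavigateBackTool:
    def invoke(self, args: dict) -> str:
        return "went back"


# One set of instances, created once.
TOOL_INSTANCES = {
    "navigate": NavigateTool(),
    "navigateBack": NavigateBackTool(),
}

# LM path: explorer_* names point at the same objects.
LM_TOOLS = {
    "explorer_navigate": TOOL_INSTANCES["navigate"],
    "explorer_navigate_back": TOOL_INSTANCES["navigateBack"],
}

# Bridge path: convenience aliases resolve to a canonical bridge command first.
BRIDGE_ALIASES = {"goBack": "navigateBack"}


def dispatch(command: str, args: dict) -> str:
    """Resolve an alias if there is one, then call the shared tool instance."""
    canonical = BRIDGE_ALIASES.get(command, command)
    tool = TOOL_INSTANCES.get(canonical)
    if tool is None:
        raise KeyError(f"unknown bridge command: {command}")
    return tool.invoke(args)


via_bridge = dispatch("navigate", {"url": "https://example.com"})
via_lm = LM_TOOLS["explorer_navigate"].invoke({"url": "https://example.com"})
```

Because both tables hold references to the same objects, a fix in one tool class applies to both access paths.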
The Command Palette exposes 41 VS Code commands total. Of these, 27 (e.g., webcure.testNavigate) are test harnesses — they prompt for input via vscode.window.showInputBox and invoke the same tool instances, displaying results in the Output panel. The remaining commands handle recording, assertions, the API server, and script execution.
All commands are accessible via Cmd+Shift+P (macOS) / Ctrl+Shift+P (Windows/Linux) under the WebCure category:
| Command | What it does |
|---|---|
| WebCure: Navigate to URL | Prompts for a URL and navigates |
| WebCure: Click Element | Prompts for text/selector and clicks |
| WebCure: Hover Element | Hover over an element |
| WebCure: Type Text | Type into a field |
| WebCure: Type Text from File | Type file contents into a field |
| WebCure: Press Key | Press a keyboard key |
| WebCure: Take Screenshot | Save a screenshot |
| WebCure: Take Snapshot | Capture accessibility tree with refs |
| WebCure: Find Element | Find element by text/selector |
| WebCure: Interact with Element | Multi-action on an element by ref |
| WebCure: Select Option | Pick a dropdown value |
| WebCure: Fill Form | Fill multiple form fields |
| WebCure: Drag Element | Drag and drop |
| WebCure: Evaluate JavaScript | Run JS in page context |
| WebCure: Extract Text Content | Extract visible text |
| WebCure: Wait For Text | Wait for text on page |
| WebCure: Wait For Element | Wait for element state |
| WebCure: Resize Browser Window | Resize viewport |
| WebCure: Navigate Back | Go back in history |
| WebCure: Manage Tabs | List/create/close/select tabs |
| WebCure: Close Browser | Close the browser |
| WebCure: Get Console Messages | Show browser console output |
| WebCure: Get Network Requests | Show observed network requests |
| WebCure: Handle Dialog | Accept/dismiss a dialog |
| WebCure: Upload File | Upload files |
| WebCure: Scrape Menu/Navigation | Extract menu structure |
| WebCure: Scrape Page Structure | Extract forms/tables |
| WebCure: Tools Menu | Quick-pick menu of all tools |
| WebCure: Start API Server | Start the HTTP API on port 5678 |
| WebCure: Stop API Server | Stop the HTTP API |
| WebCure: Record API Script (Legacy) | Begin recording API-based browser actions |
| WebCure: Stop API Script Recording | Stop and generate Python script |
| WebCure: Record Browser Session | Start session recording (choose output) |
| WebCure: Stop Browser Session | Stop session recording and save output |
| WebCure: Insert Wait Step | Insert a timed pause at a specific point |
| WebCure: Assert: Element | Pick assertion type, then click target element (Cmd+Shift+A) |
| WebCure: Assert: Page Title | Assert current page title matches |
| WebCure: Assert: Page URL | Assert current page URL (exact or contains) |
| WebCure: Assert: Element Count | Click element, then assert how many matching elements exist |
| WebCure: Assert: Full Page Snapshot | Assert page body contains specific text |
| WebCure: Run Script | Execute a JSON automation script |
WebCure includes an HTTP API server for programmatic access from external scripts or tools.
Enable via settings (webcure.api.enabled: true) or start manually:
`Cmd+Shift+P` → WebCure: Start API Server
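
Once the server is running, a `POST /invoke` call can also be built from Python with only the standard library. The sketch below assumes the default host and port from the settings table (`127.0.0.1`, `5678`) and the same JSON body of `command` plus `args` used elsewhere in this document; `build_invoke_request` is a hypothetical helper name, and the response format is not assumed, so the send step is left commented out.

```python
import json
import urllib.request


def build_invoke_request(command: str, args: dict,
                         host: str = "127.0.0.1", port: int = 5678) -> urllib.request.Request:
    """Build (but do not send) a POST /invoke request carrying a command and its args."""
    body = json.dumps({"command": command, "args": args}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/invoke",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_invoke_request("navigate", {"url": "https://example.com"})

# Sending requires the WebCure API server to be running:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.read().decode("utf-8"))
```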
Endpoints:
# Invoke any command
curl -X POST http://localhost:5678/invoke \
-H "Content-Type: application/json" \
-d '{"command": "navigate", "args": {"url": "https://example.com"}}'
# List available tools
curl http://localhost:5678/tools
# Health check
curl http://localhost:5678/health

WebCure can record every user interaction in the browser and produce human-readable documentation and/or a Playwright Python test script. Choose from three output modes when you start a recording.
Command Palette → WebCure: Record Browser Session presents a mode picker:
| Mode | What is produced |
|---|---|
| Markdown + Screenshots | A timestamped folder containing Recording.md plus a screenshot for each step. Great for test documentation, bug reports, and onboarding guides. |
| Python Test Script | A standalone Playwright Python script with self-healing locators. No screenshots folder is created. |
| Both | Markdown with screenshots and a Python script, saved together in the same session folder. |
- Start recording: Command Palette → WebCure: Record Browser Session
- Choose output mode (Markdown / Python / Both)
- (Markdown/Both) Optionally enter a folder name (blank = auto-timestamp)
- (Python/Both) Optionally enter a script filename (default `test_recording.py`)
- (Python/Both) Optionally enter a default wait between steps in seconds (e.g. `1`). If set, `time.sleep(N)` is added after each action in the generated script.
- Optionally enter an initial URL (defaults to `https://demo.testfire.net`)
- Interact normally: every click, form input, file upload, and Enter key press is captured
- Stop recording: Command Palette → WebCure: Stop Browser Session, or simply close the browser window

| User Action | Recorded As | Screenshot |
|---|---|---|
| Click a link or button | Clicked on button 'Login' | Taken after the page reacts to the click |
| Click a dropdown trigger | Clicked on button 'Options' | Captured via deferred pointerdown |
| Click a menu item | Selected 'Edit' from 'Options' dropdown | ARIA-aware with trigger label lookup |
| Select a dropdown option | Selected menu item 'Orange' | Deferred pointerdown for Radix Select |
| Change a `<select>` | Selected 'Active' from 'Status' | Taken after change |
| Type into a field | Typed 'admin' into 'Username' | Taken immediately (shows the typed value) |
| Type a password | Typed '********' into 'Password' | Password value is masked |
| Press Enter | Pressed 'Enter' on 'Search' | Taken after the page reacts |
| Upload a file | Uploaded file to 'Resume' (path captured via dialog) | Taken after file selected |
| Navigate to URL | Performed 'navigate' on 'Navigated to https://...' | Taken after page load |
| Close the browser | Performed 'close' on 'Browser window closed' | No screenshot (browser is gone) |
| Insert Wait Step | Wait 2 seconds | No screenshot (no UI change) |

Markdown + Screenshots and Both modes create a timestamped folder:
WebCure_Steps_2026-03-09_22-13-00/
├── Recording.md # Markdown log with all steps
├── test_recording.py # Python script (Both mode only)
├── step_1.png
├── step_2.png
└── ...
Python only mode writes a single timestamped script directly to the workspace root:
test_recording_2026-03-09_22-13-00.py
The Markdown file contains structured entries like:
### Step 3
**Action:** Typed 'admin' into 'Username'
The Python script uses self-healing locators that try multiple strategies in confidence order (test ID → ID → ARIA role → link text → visible text → name → CSS → XPath):
# Step 3: Typed 'admin' into 'Username'
self_healing_fill(page, [
{"strategy": "id", "value": "uid", "confidence": 0.95},
{"strategy": "ariaLabel", "value": "Username", "confidence": 0.85},
{"strategy": "css", "value": "input[name='uid']", "confidence": 0.6},
], "admin")

During a recording session, you can insert assertions to verify the state of the page or specific elements. Assertions are recorded as steps and generate corresponding Python assertion calls in the output script.
- Start a recording — Command Palette → WebCure: Record Browser Session (Python or Both mode)
- Interact normally — click, type, navigate
- Insert an assertion: press `Cmd+Shift+A` (macOS) / `Ctrl+Shift+A`, pick the assertion type, then click the target element
- Stop recording: the generated Python script includes assertion calls with pass/fail logging

Press Cmd+Shift+A or Command Palette → Assert: Element to see a QuickPick menu:
| Assertion | What It Verifies | Example Use Case |
|---|---|---|
| Visible | Element is visible on the page | Verify a success message appeared after form submission |
| Not Visible | Element is NOT visible | Verify an error banner disappeared after correction |
| Text | Element contains specific text | Verify a heading shows "Welcome, Admin" after login |
| Value | Input/select has a specific value | Verify a form field retained its value after page reload |
| Checked | Checkbox/radio is checked | Verify "Remember me" checkbox is checked by default |
| Not Checked | Checkbox/radio is NOT checked | Verify a terms checkbox starts unchecked |
| Enabled | Element is enabled (clickable) | Verify the submit button is active after filling required fields |
| Disabled | Element is disabled | Verify the submit button is greyed out before filling the form |
| Attribute | Element attribute has expected value | Verify a link's href attribute points to the correct URL |
After picking the type, click the target element on the page. WebCure captures the element's locators and current state, then records the assertion step.
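
The locators captured for an assertion use the same candidate format as the action steps: a list of strategies, each with a confidence score. A minimal sketch of how such a list might be resolved, highest confidence first, is shown here. `resolve_locator`, `fake_lookup`, and the dictionary standing in for the page are hypothetical; the helpers in a generated script may work differently.

```python
from typing import Callable, Optional


def resolve_locator(lookup: Callable[[str, str], Optional[str]],
                    candidates: list[dict]) -> tuple[str, str]:
    """Try candidates in descending confidence and return the first one that matches."""
    ordered = sorted(candidates, key=lambda c: c["confidence"], reverse=True)
    for candidate in ordered:
        element = lookup(candidate["strategy"], candidate["value"])
        if element is not None:
            return candidate["strategy"], element
    raise LookupError("no locator candidate matched the page")


# Stand-in for a page: only the CSS selector "h1" resolves to an element.
fake_dom = {("css", "h1"): "<h1>Hello Admin User</h1>"}


def fake_lookup(strategy: str, value: str) -> Optional[str]:
    return fake_dom.get((strategy, value))


candidates = [
    {"strategy": "xpath", "value": "//h1", "confidence": 0.5},
    {"strategy": "id", "value": "page-title", "confidence": 0.95},  # stale ID, no match
    {"strategy": "css", "value": "h1", "confidence": 0.8},
]

strategy, element = resolve_locator(fake_lookup, candidates)
```

Here the highest-confidence `id` candidate no longer matches, so the sketch falls through to the `css` candidate, which is the "self-healing" behavior in miniature.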
These don't require clicking an element:
| Command | What It Verifies | Example Use Case |
|---|---|---|
| Assert: Page Title | `document.title` matches exactly | Verify the page title changes to "Dashboard" after login |
| Assert: Page URL | URL matches (exact or contains) | Verify redirect to /dashboard after successful login |
| Assert: Element Count | Number of matching elements | Verify exactly 5 search results are displayed |
| Assert: Full Page Snapshot | Page body contains specific text | Verify "Sign Off" link appears somewhere on the page |
Generated Python scripts include a built-in logging system:
- Each step is wrapped in `try/except` with `_record_step()` for pass/fail tracking
- Terminal output shows ✅/❌ icons with step descriptions in real time
- Log file is saved as `test_results_YYYYMMDD_HHMMSS.log`
- Summary prints at the end: `TEST SUMMARY: 34/35 steps passed, 1 failed`
- Exit code is `1` if any step failed, `0` if all passed

Example output:
✅ Step 1: Navigate to https://example.com — PASS
✅ Step 2: Assert page title is 'Example' — PASS
✅ Step 3: Typed 'admin' into username — PASS
❌ Step 4: Assert heading text contains 'Welcome' — FAIL [Expected text 'Welcome' in 'Hello']
============================================================
TEST SUMMARY: 3/4 steps passed, 1 failed
============================================================
Failed steps:
Step 4: Assert heading text contains 'Welcome'
Error: Expected text 'Welcome' in 'Hello'
Full log saved to: test_results_20260329_160005.log
============================================================
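
The pass/fail tracking described in this section could be sketched roughly as follows. This is an illustration only: `run_step` and `summarize` are hypothetical names (the generated scripts use `_record_step()`, whose signature is not shown in this document), and log-file writing is omitted.

```python
results: list[tuple[int, str, bool, str]] = []


def run_step(number: int, description: str, action) -> None:
    """Run one step inside try/except and record whether it passed."""
    try:
        action()
    except Exception as error:  # any failure marks the step as failed
        results.append((number, description, False, str(error)))
        print(f"❌ Step {number}: {description} FAIL [{error}]")
    else:
        results.append((number, description, True, ""))
        print(f"✅ Step {number}: {description} PASS")


def summarize(recorded: list[tuple[int, str, bool, str]]) -> int:
    """Print the summary line and return the exit code (1 if any step failed)."""
    failed = [entry for entry in recorded if not entry[2]]
    passed = len(recorded) - len(failed)
    print(f"TEST SUMMARY: {passed}/{len(recorded)} steps passed, {len(failed)} failed")
    return 1 if failed else 0


def check_heading() -> None:
    heading = "Hello"
    assert "Welcome" in heading, f"Expected text 'Welcome' in '{heading}'"


run_step(1, "Navigate to https://example.com", lambda: None)
run_step(2, "Assert heading text contains 'Welcome'", check_heading)
exit_code = summarize(results)  # a script would pass this to sys.exit()
```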
Assertions use the same self-healing locator system as action steps:
# Step 5: Assert heading text contains 'Hello Admin User'
assert_element_text(page, [
{"strategy": "css", "value": "h1", "confidence": 0.8},
{"strategy": "xpath", "value": "//h1", "confidence": 0.5},
], "Hello Admin User")
# Step 6: Assert page URL contains 'dashboard'
assert_page_url(page, "dashboard", "contains")
# Step 7: Assert exactly 2 checkboxes exist
assert_element_count(page, [
{"strategy": "css", "value": "input[type='checkbox']", "confidence": 0.9},
], 2)

To insert an explicit pause at a specific point during recording: Command Palette → WebCure: Insert Wait Step. You will be prompted for a duration in seconds. This adds a time.sleep(N) call at that exact position in the generated Python script.
For a uniform wait after every step, use the default wait between steps option at recording start instead.
When the browser opens a native file-chooser dialog, WebCure:
- Intercepts it automatically
- Prompts you to enter the file path (or leave blank to record a placeholder)
- Records the upload step with captured locators
The generated Python script emits an upload_file() call with self-healing locators, or a # TODO comment if no path was provided.
The step recorder uses multiple heuristics to produce human-readable element names:
- `<label for="...">` associations: preferred for form fields
- Parent `<label>` wrappers: for fields wrapped inside labels
- Adjacent table cell text: for table-based layouts (e.g., "Username:" from a neighboring `<td>`)
- Previous sibling text: `<span>`, `<label>`, `<b>` elements before the input
- Button value/text: for `<button>` and `<input type="submit">`, the button's own text takes priority
- ARIA attributes: `aria-label`, `title`, `placeholder`
- ARIA roles: `role="menuitem"`, `role="option"`, etc. with parent menu trigger lookup via `aria-labelledby`
- Element ID or name: as a fallback
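
Read as a fallback chain, the heuristics above might look like the following Python sketch. It assumes the list is in priority order, which the text implies but does not state outright, and it leaves out the ARIA role lookup. The `signals` dictionary, its keys, and `readable_name` are hypothetical and are not taken from the recorder's source.

```python
# Hypothetical signal keys, in the assumed priority order.
NAME_SOURCES = [
    "label_for",         # <label for="..."> association
    "parent_label",      # wrapping <label>
    "adjacent_cell",     # neighboring table cell text
    "previous_sibling",  # <span>/<label>/<b> before the input
    "aria_label",
    "title",
    "placeholder",
    "id",
    "name",
]


def readable_name(signals: dict) -> str:
    """Return the first non-empty naming signal; buttons prefer their own text."""
    if signals.get("tag") in ("button", "submit"):
        own_text = (signals.get("text") or "").strip()
        if own_text:
            return own_text
    for key in NAME_SOURCES:
        value = (signals.get(key) or "").strip()
        if value:
            return value
    return "unknown element"


field_name = readable_name({"tag": "input", "adjacent_cell": "Username:", "id": "uid"})
button_name = readable_name({"tag": "button", "text": "Login", "id": "btnLogin"})
fallback_name = readable_name({"tag": "input", "id": "uid"})
```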
Each step also records CSS selector and XPath in HTML comments for reference.
The step recorder handles modern component libraries (Radix UI, shadcn/ui, etc.) that use pointerdown events and portal-rendered menus:
- Dropdown triggers: captured via a "deferred pointerdown" strategy. The recorder captures every `pointerdown` on interactive elements, waits 400ms, and records it as a click only if no matching `click` event follows. This handles Radix UI DropdownMenu triggers that open a portal on `pointerdown`, causing the `click` to land on `document.body`.
- Select options: Radix UI Select options (`role="option"`) are removed from the DOM between `pointerdown` and `mouseup`, so the `click` event never fires on the option. The deferred pointerdown timer catches these and records the selection.
- Menu items: elements with `role="menuitem"` (or `data-slot="dropdown-menu-item"` / `data-slot="select-item"`) are detected and the recorder walks up the DOM to find the parent `[role="menu"]` or `[role="listbox"]` container, then resolves the trigger label via `aria-labelledby`.
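
The deferred pointerdown rule can be modeled offline on a list of timestamped events. The sketch below is a simplified Python model, not the recorder's in-page JavaScript: the event tuples, the target strings, and the choice to ignore clicks that land on `document.body` are assumptions made for illustration. Only the 400 ms window and the "record unless a matching click follows" rule come from the description.

```python
DEFER_MS = 400  # window in which a matching click cancels the deferred pointerdown


def recorded_clicks(events: list[tuple[int, str, str]]) -> list[tuple[int, str]]:
    """events: (timestamp_ms, kind, target) tuples sorted by time."""
    recorded = []
    for index, (ts, kind, target) in enumerate(events):
        if kind == "click":
            # Assumption: a click that lands on document.body is not a useful step.
            if target != "document.body":
                recorded.append((ts, target))
        elif kind == "pointerdown":
            followed_by_click = any(
                later_kind == "click"
                and later_target == target
                and later_ts - ts <= DEFER_MS
                for later_ts, later_kind, later_target in events[index + 1:]
            )
            if not followed_by_click:
                # No matching click arrived in time: treat the pointerdown as the click.
                recorded.append((ts, target))
    return recorded


events = [
    (0, "pointerdown", "button#options"),  # opens a portal on pointerdown
    (30, "click", "document.body"),        # the click lands on the body instead
    (1000, "pointerdown", "a#home"),       # an ordinary link
    (1050, "click", "a#home"),             # its click fires normally
]
steps = recorded_clicks(events)
```

In this model the dropdown trigger is recorded from its `pointerdown`, while the ordinary link is recorded once, from its `click`.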
The step recorder can also be started and stopped via the CLI, enabling AI agents to trigger recordings programmatically:
# Markdown mode (default)
node .webcure/cli.js startStepRecorder url=https://example.com
# Python mode with 1-second wait between steps
node .webcure/cli.js startStepRecorder url=https://example.com mode=python defaultWaitSeconds=1
# ... interact with the browser ...
node .webcure/cli.js stopStepRecorder

If you close the browser window while recording is active, the recorder:
- Logs a final "Browser window closed" step (without a screenshot)
- Waits for any pending steps to finish writing
- Automatically stops recording and opens the Markdown preview
WebCure can record your browser actions and generate a Python script that replays them via the API server. This is the original recording method — for most use cases, Browser Session Recording is recommended instead.
The webcure Python package lives in the python/ directory of this repository. Install it once:
pip install /path/to/webcure/python

Or in development/editable mode:

pip install -e /path/to/webcure/python

This provides module-level convenience functions (navigate, click, type_text, etc.) that the generated scripts import.

- Open the Command Palette (`Cmd+Shift+P`)
- Run WebCure: Record API Script (Legacy)
- Perform browser actions — navigate, click, type, resize, etc. using any WebCure command (Command Palette or Copilot)
- Run WebCure: Stop API Script Recording (works even if the browser was closed during the session)
All actions are logged in the WebCure Tools Output channel. Start and stop recording events are also logged there with timestamps.
A new Python script opens in your editor with the recorded actions.
The script needs the WebCure API server running to execute commands:
- Start the API server — Command Palette → WebCure: Start API Server (or it auto-starts when you stop recording)
- Run the script:
python recording.py

#!/usr/bin/env python3
# Auto-generated by WebCure
from webcure import click, close_browser, navigate, resize_browser, type_text
navigate("https://demo.testfire.net")
resize_browser("fullscreen")
click("ONLINE BANKING LOGIN")
type_text("admin", into="#uid")
type_text("admin", into="#passw")
click("#login > table > tbody > tr:nth-child(3) > td:nth-child(2) > input[type=submit]")
click("#btnGetAccount")
close_browser()

The Python client automatically detects whether a target is a CSS selector (e.g., #uid, .class, div > span) or visible text (e.g., "ONLINE BANKING LOGIN") and routes it to the correct Playwright locator strategy.
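
One plausible way to make that selector-versus-text decision is a small pattern check. The heuristic below is an assumption for illustration and is not the `webcure` package's actual logic; `looks_like_css_selector` is a hypothetical name.

```python
import re

# Hints that a target is a CSS selector: a leading '#', '.' or '[',
# a combinator (>, ~, +), an attribute selector, or :nth-child(...).
_SELECTOR_HINT = re.compile(r"^[#.\[]|[>~+]|\[[^\]]+\]|:nth-child\(")


def looks_like_css_selector(target: str) -> bool:
    """Return True if the target resembles a CSS selector rather than visible text."""
    return bool(_SELECTOR_HINT.search(target.strip()))


is_selector = looks_like_css_selector("#uid")
is_text = not looks_like_css_selector("ONLINE BANKING LOGIN")
```

A heuristic like this has edge cases: visible text containing `+` or `>` (for example "Save + Close") would be misread as a selector, so a real client would need a more careful rule or an explicit override.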
The webcure package also supports class-based usage:
from webcure import WebCure
wc = WebCure(port=5678)
wc.invoke("navigate", {"url": "https://example.com"})
print(wc.health()) # True if API server is running
print(wc.tools()) # List available tool names

To change the default port for module-level functions:
import webcure
webcure.set_port(9999)
webcure.navigate("https://example.com")

Execute multi-step automation scripts with variables, capture patterns, and retry logic.
{
"name": "Login and verify dashboard",
"stopOnError": true,
"retries": 1,
"retryDelay": 2000,
"variables": {
"baseUrl": "https://example.com",
"username": "admin"
},
"steps": [
{
"command": "navigate",
"args": { "url": "${baseUrl}/login" }
},
{
"command": "typeText",
"args": { "text": "${username}", "into": "Username" }
},
{
"command": "click",
"args": { "target": "Sign In" }
},
{
"command": "find",
"args": { "text": "Dashboard" },
"captureRef": "dashRef"
},
{
"command": "interact",
"args": { "action": "click", "ref": "${dashRef}" }
}
]
}

- Command Palette: `Cmd+Shift+P` → WebCure: Run Script → select a `.json` file
- CLI: `node .webcure/cli.js runScript file=/path/to/script.json`

| Feature | Description |
|---|---|
| Variables | Define at script level, use ${varName} in any step arg |
| captureRef | Auto-extract [ref=eN] from step output into a variable |
| capturePattern | Regex with capture group to extract values from output text |
| captureValue | Map { variableName: "property.path" } to extract structured values |
| Retries | Script-level retries + retryDelay with per-step overrides |
| stopOnError | Continue or halt on failure (default: true, per-step override) |
| Command Aliases | Both camelCase and snake_case accepted |
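
Variable substitution and `captureRef` can be illustrated with a few lines of Python. This is a sketch of the idea, not the runner's implementation: `substitute`, `capture_ref`, and the sample output string are hypothetical, and unknown variables are simply left in place here, which may differ from the real behavior.

```python
import re

VAR_PATTERN = re.compile(r"\$\{(\w+)\}")    # matches ${varName}
REF_PATTERN = re.compile(r"\[ref=(e\d+)\]")  # matches [ref=eN]


def substitute(value, variables: dict):
    """Recursively replace ${varName} in strings, dicts, and lists."""
    if isinstance(value, str):
        return VAR_PATTERN.sub(
            lambda match: str(variables.get(match.group(1), match.group(0))), value
        )
    if isinstance(value, dict):
        return {key: substitute(item, variables) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute(item, variables) for item in value]
    return value


def capture_ref(output: str, name: str, variables: dict) -> None:
    """Store the first [ref=eN] found in a step's output under the given variable name."""
    match = REF_PATTERN.search(output)
    if match:
        variables[name] = match.group(1)


variables = {"baseUrl": "https://example.com", "username": "admin"}
navigate_args = substitute({"url": "${baseUrl}/login"}, variables)

# Hypothetical output text from a "find" step.
capture_ref('Found link "Dashboard" [ref=e7]', "dashRef", variables)
interact_args = substitute({"action": "click", "ref": "${dashRef}"}, variables)
```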
Open Settings (Cmd+, / Ctrl+,) and search for webcure:
| Setting | Default | Description |
|---|---|---|
| `webcure.api.enabled` | `false` | Enable the HTTP API server on activation |
| `webcure.api.port` | `5678` | Port for the HTTP API server |
| `webcure.api.host` | `127.0.0.1` | Host address for the HTTP API server |
| `webcure.bridge.enabled` | `true` | Enable the file bridge for AI agent integration |

| Variable | Default | Description |
|---|---|---|
| `WEBEXPLORER_BROWSER` | (Chrome) | Set to `msedge` to use Microsoft Edge instead of Chrome |

cd ~/Developer/webcure
# Install dependencies
npm install
# Compile TypeScript
npm run compile
# Watch mode (recompile on file changes)
npm run watch
# Package into .vsix
npm run package

- Open the `webcure/` folder in VS Code
- Press `F5`
- A new VS Code window opens with the extension loaded
- Open the Command Palette and type "WebCure" to test commands

webcure/
├── src/
│ ├── extension.ts # Entry point: registers tools, bridge, API, commands
│ ├── tools.ts # 28 Language Model Tool classes
│ ├── browserManager.ts # Playwright-core browser singleton
│ ├── apiServer.ts # HTTP API server
│ ├── constants.ts # Bridge directory/file names
│ ├── types.ts # Shared types
│ ├── bridge/
│ │ ├── file-bridge.ts # File-based command router
│ │ └── cli-template.js # CLI helper (copied to .webcure/)
│ └── recorder/
│ ├── action-log.ts # Start/stop/record actions
│ ├── script-generator.ts # Convert actions to Python
│ ├── step-recorder.ts # Step recorder (Markdown + Python + assertions)
│ └── element-rules-engine.ts # W3C/ARIA element classification engine
├── python/
│ ├── pyproject.toml # Python package metadata
│ ├── setup.py # Package setup (pip install)
│ └── webcure/
│ ├── __init__.py # Convenience functions (navigate, click, etc.)
│ └── client.py # WebCure API client class
├── tests/
│ ├── MANUAL-TEST-RESULTS.md # Manual test documentation
│ ├── bridge-integration-tests.sh # Automated bridge integration tests
│ ├── unit/
│ │ ├── tools.test.ts # Unit tests (bridge routing, recording, params)
│ │ └── element-rules-engine.test.ts # 113 unit tests for the rules engine
│ └── integration/
│ ├── live_engine_test.py # 63 Python live browser tests against real sites
│ ├── test_assertions.py # 46 assertion helper integration tests
│ ├── test_recorded_assertions.py # 35-step end-to-end recorded-style test
│ └── screenshots/ # Test screenshots captured during live tests
├── status/
│ ├── project_status_01.md # Initial release status report
│ ├── project_status_02.md # Recording fix & Python package
│ ├── project_status_03.md # Action persistence & interact tool fixes
│ ├── project_status_04.md # Step recorder feature
│ ├── project_status_05.md # Radix UI fixes & CLI step recorder
│ ├── project_status_06.md # Deferred pointerdown & Select support
│ ├── project_status_07.md # Element rules engine
│ ├── project_status_08.md # Python test script generation
│ └── project_status_09.md # Assertion recording & pass/fail logging
├── out/ # Compiled JavaScript (tsc output)
├── package.json # Extension manifest + tool/command declarations
├── tsconfig.json # TypeScript configuration
└── README.md
# TypeScript unit tests (playwright-core + tsx)
npm install # installs playwright-core, tsx, typescript from package.json
# Python live browser integration tests
pip install playwright # Python Playwright bindings (v1.58+)
# Browser binaries (shared by both TS and Python tests)
npx playwright install chromium

# TypeScript unit tests: 113 tests for the Element Rules Engine
npx tsx tests/unit/element-rules-engine.test.ts
# Bridge routing + parameter transformation unit tests
npm run test:unit
# Python live browser integration tests — 63 tests against real websites
# (demo.testfire.net, the-internet.herokuapp.com, Radix UI Themes Playground, W3C WAI-ARIA)
python3 tests/integration/live_engine_test.py
# Assertion helper integration tests — 46 tests against live websites
python3 tests/integration/test_assertions.py
# End-to-end recorded-style assertion test — 35 steps with pass/fail logging
python3 tests/integration/test_recorded_assertions.py
# Automated bridge integration tests (requires VS Code + extension active)
bash tests/bridge-integration-tests.sh

- Make sure VS Code version is 1.95 or later (required for `vscode.lm.registerTool`)
- Check the Output panel → select "WebCure Tools" for error messages

- WebCure uses `playwright-core` which connects to your system Chrome (not a bundled Chromium)
- Make sure Google Chrome or Microsoft Edge is installed
- To use Edge: set environment variable `WEBEXPLORER_BROWSER=msedge`

- Ensure GitHub Copilot is active and you have a Copilot subscription
- The tools should appear when you type `#explorer_` in Copilot chat
- Try asking Copilot explicitly: "Use explorer_navigate to go to https://example.com"

- Check that `webcure.bridge.enabled` is `true` in settings
- Verify `.webcure/` directory exists in your workspace root
- Check that `input.json` is being created (the CLI writes it)
- Look at the Output panel → "WebCure Tools" for errors

- Run `npm install` first to ensure all dependencies are installed
- If `vsce` is not found: `npx @vscode/vsce package`
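
The file bridge checks above can be partly automated. The sketch below reports which bridge files exist under a workspace root; `bridge_status` is a hypothetical helper, and the file names are the ones this README mentions (`.webcure/`, `cli.js`, `input.json`, `output.json`). Whether the bridge removes `input.json` after processing a command is not stated here, so a missing `input.json` is not necessarily an error.

```python
import tempfile
from pathlib import Path


def bridge_status(workspace_root: str) -> dict[str, bool]:
    """Report which bridge files exist under <workspace_root>/.webcure/."""
    bridge_dir = Path(workspace_root) / ".webcure"
    return {
        ".webcure/": bridge_dir.is_dir(),
        "cli.js": (bridge_dir / "cli.js").is_file(),
        "input.json": (bridge_dir / "input.json").is_file(),
        "output.json": (bridge_dir / "output.json").is_file(),
    }


# Demo against a throwaway directory standing in for a workspace root.
with tempfile.TemporaryDirectory() as root:
    (Path(root) / ".webcure").mkdir()
    (Path(root) / ".webcure" / "cli.js").write_text("// placeholder")
    demo_status = bridge_status(root)
    print(demo_status)
```

In a real workspace, call `bridge_status(".")` from the workspace root; if `.webcure/` itself is reported missing, check the `webcure.bridge.enabled` setting first.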