WebCure

Browser automation for AI agents and developers. Search the web, test browser-based applications, and record automation scripts — all from your editor.

WebCure is a hybrid VS Code extension that combines two separate browser automation approaches into one package:

Language Model Tools — 28 tools registered with VS Code's vscode.lm.registerTool() API so GitHub Copilot can control the browser directly in chat.
File Bridge + CLI — A file-based protocol (.webcure/input.json → output.json) with a CLI wrapper so AI agents in Cursor, Antigravity, or any terminal-capable IDE can control the same browser engine via shell commands.

Both approaches share the same Playwright-based browser engine. Whether Copilot invokes explorer_navigate or a Cursor agent runs node .webcure/cli.js navigate url=..., the same underlying code executes.

Architecture Overview
Prerequisites
Installation (Step by Step)
Quick Start: VS Code Copilot (Language Model Tools)
Quick Start: Cursor / AI Agents (File Bridge + CLI)
Language Model Tools Reference
File Bridge Commands Reference
How LM Tools and Bridge Commands Relate
VS Code Commands (Command Palette)
HTTP API Server
Browser Session Recording
Assertions
API Script Recording & Python Playback (Legacy)
JSON Script Runner
Configuration
Development
Testing
Troubleshooting

Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│                          WebCure Extension                          │
│                                                                     │
│  ┌──────────────────────┐   ┌──────────────────────┐                │
│  │ Language Model Tools │   │ File Bridge + CLI    │                │
│  │ (VS Code Copilot)    │   │ (Cursor / Agents)    │                │
│  │                      │   │                      │                │
│  │ 28 explorer_* tools  │   │ .webcure/input.json  │                │
│  │ registered via       │   │ .webcure/output.json │                │
│  │ vscode.lm API        │   │ .webcure/cli.js      │                │
│  └──────────┬───────────┘   └──────────┬───────────┘                │
│             │                          │                            │
│             ▼                          ▼                            │
│  ┌───────────────────────────────────────────────┐                  │
│  │       Shared Tool Instances (28 tools)        │                  │
│  └───────────────────────┬───────────────────────┘                  │
│                          │                                          │
│                          ▼                                          │
│  ┌───────────────────────────────────────────────┐                  │
│  │       BrowserManager (Playwright-core)        │                  │
│  │          Uses system Chrome or Edge           │                  │
│  └───────────────────────────────────────────────┘                  │
│                                                                     │
│  ┌──────────────────┐  ┌─────────────────────┐  ┌─────────────────┐ │
│  │ HTTP API Server  │  │ API Script Recorder │  │ Browser Session │ │
│  │ (port 5678)      │  │ (Legacy)            │  │ Recorder        │ │
│  └──────────────────┘  └─────────────────────┘  └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

Key points:

The 28 Language Model Tool classes are the core engine. They implement vscode.LanguageModelTool.
The file bridge handles 61 command names — 38 map to the 28 tool instances, plus 23 bridge-only commands for scrolling, recording, etc.
Commands like scrollDown, doubleClick, rightClick, launchBrowser, getPageText, highlight exist only in the file bridge — they have no LM tool equivalent because they are simple Playwright calls that don't need the full tool infrastructure.
The HTTP API server provides a third access path to the same tools via POST /invoke.

Prerequisites

Node.js 18+ — check with node --version
npm — comes with Node.js
Google Chrome or Microsoft Edge — WebCure uses playwright-core which connects to your system browser (it does not download Chromium automatically)
VS Code 1.95+ — for Language Model Tools support (requires Copilot)

Installation (Step by Step)

Step 1: Get the source code

cd ~/Developer
git clone https://github.com/naveedulislam/webcure.git webcure

Or if you already have the source:

cd ~/Developer/webcure

Step 2: Install dependencies

npm install

Step 3: Compile the TypeScript

npm run compile

This runs tsc and outputs JavaScript to the out/ directory.

Step 4: Package into a .vsix file

npm run package

This produces webcure-<version>.vsix (e.g. webcure-1.0.0.vsix) in the project root. The version comes from package.json.

Step 5: Install in your editor

VS Code (command line):

code --install-extension webcure-1.0.0.vsix

VS Code (graphical):

Open VS Code
Press Cmd+Shift+P (macOS) or Ctrl+Shift+P (Windows/Linux)
Type Extensions: Install from VSIX...
Navigate to ~/Developer/webcure/
Select webcure-1.0.0.vsix
Restart VS Code when prompted

Cursor (command line):

cursor --install-extension webcure-1.0.0.vsix

Cursor (graphical):

Open Cursor
Press Cmd+Shift+P (macOS) or Ctrl+Shift+P (Windows/Linux)
Type Extensions: Install from VSIX...
Select webcure-1.0.0.vsix
Restart Cursor

Step 6: Verify the installation

Open the Command Palette: Cmd+Shift+P (macOS) or Ctrl+Shift+P (Windows/Linux)
Type WebCure
You should see all WebCure commands listed (Navigate to URL, Click Element, etc.)

Quick Start: VS Code Copilot (Language Model Tools)

Once the extension is installed and VS Code has restarted, the 28 explorer_* tools are automatically registered with VS Code's Language Model API. GitHub Copilot can use them directly in chat.

Example conversation:

You: Navigate to https://example.com and take a screenshot
Copilot: (uses explorer_navigate then explorer_take_screenshot automatically)
Done — screenshot saved to screenshot.png

You: Find the login form and fill in the email and password fields
Copilot: (uses explorer_snapshot, explorer_fill_form)

You can also reference tools explicitly by typing # in chat:

#explorer_navigate to https://news.ycombinator.com
#explorer_snapshot to see the page structure
#explorer_click on "new" link

No configuration required. The tools are available as soon as the extension activates (onStartupFinished).

Quick Start: Cursor / AI Agents (File Bridge + CLI)

When the extension activates, it creates a .webcure/ directory in your workspace root containing:

cli.js — CLI helper that writes input.json and polls for output.json
input.json — Written by the agent, read by the extension
output.json — Written by the extension, read by the agent

How It Works

Agent runs: node .webcure/cli.js navigate url=https://example.com
cli.js writes {"command": "navigate", "args": {"url": "https://example.com"}} to .webcure/input.json
The extension detects the file via fs.watch, executes the command, writes the result to .webcure/output.json, and deletes input.json
cli.js polls for output.json, reads it, prints the result, and deletes it
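
The four steps above can be sketched from the agent's side in Python. This is a minimal illustration of the write-then-poll pattern, not a port of cli.js: the function name send_command, the timeout, and the polling interval are assumptions, and the sketch returns whatever JSON the extension wrote without assuming its shape.

```python
import json
import time
from pathlib import Path


def send_command(bridge_dir, command, args=None, timeout=10.0, poll_interval=0.05):
    """Write input.json, then poll for output.json and return its parsed content.

    Hypothetical helper: name and defaults are illustrative, not part of WebCure.
    """
    bridge = Path(bridge_dir)
    input_path = bridge / "input.json"
    output_path = bridge / "output.json"

    # Drop any stale result so an old answer is never mistaken for the new one.
    output_path.unlink(missing_ok=True)

    # Same request shape as described above: a command name plus an args object.
    input_path.write_text(json.dumps({"command": command, "args": args or {}}))

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if output_path.exists():
            try:
                result = json.loads(output_path.read_text())
            except json.JSONDecodeError:
                # The writer may still be mid-write; try again shortly.
                time.sleep(poll_interval)
                continue
            # Clean up, mirroring the last step in the list above.
            output_path.unlink(missing_ok=True)
            return result
        time.sleep(poll_interval)
    raise TimeoutError(f"no output.json within {timeout} seconds")
```

With the extension active, a call such as send_command(".webcure", "navigate", {"url": "https://example.com"}) would follow the same path as the CLI invocation in the first step.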

CLI Examples

# Launch a browser and navigate
node .webcure/cli.js launchBrowser url=https://example.com

# Click by visible text
node .webcure/cli.js click target="Sign In"

# Click with spatial targeting (click "Edit" below "Profile")
node .webcure/cli.js click target="Edit" below="Profile"

# Type into a field
node .webcure/cli.js typeText text="hello" into="Search"

# Take an accessibility snapshot (assigns refs e1, e2, ...)
node .webcure/cli.js snapshot

# Find an element (returns a ref for later use)
node .webcure/cli.js find text="Submit"

# Interact using a ref from snapshot or find
node .webcure/cli.js interact action=click ref=e3

# Fill a form using refs
node .webcure/cli.js fillForm 'fields=[{"name":"Email","type":"textbox","ref":"e3","value":"test@example.com"}]'

# Scroll the page
node .webcure/cli.js scrollDown pixels=500

# Take a screenshot
node .webcure/cli.js screenshot filename=result.png

# Get page text content
node .webcure/cli.js getPageText

# Scrape menu structure
node .webcure/cli.js scrapeMenu

# Scrape page structure (forms, tables)
node .webcure/cli.js scrapePage

# Close the browser
node .webcure/cli.js closeBrowser

# See all available commands
node .webcure/cli.js help

Can LM Tools Be Used Through input.json?

Yes. The file bridge routes commands to the same 28 tool instances that Copilot uses. When you run node .webcure/cli.js navigate url=..., the bridge internally calls NavigateTool.invoke() — the exact same code path as when Copilot uses explorer_navigate. The bridge command names are just friendlier aliases (e.g., navigate instead of explorer_navigate, click instead of explorer_click).

Language Model Tools Reference

These 28 tools are registered with VS Code's vscode.lm.registerTool() API. Copilot invokes them automatically based on user requests.

Tool Name	Display Name	Description
`explorer_navigate`	Navigate to URL	Open a URL in the browser
`explorer_resize`	Resize Window	Resize viewport (preset `fullscreen` or custom width/height)
`explorer_extract`	Extract Content	Extract visible text from the page or a CSS selector
`explorer_click`	Click Element	Click by ref, text, or selector with spatial targeting
`explorer_hover`	Hover Element	Hover over an element
`explorer_type`	Type Text	Type into an input field
`explorer_type_from_file`	Type Text from File	Type large text content from a file into an input
`explorer_wait_for`	Wait For	Wait for text to appear/disappear or a fixed time
`explorer_wait_for_element`	Wait For Element	Wait for an element to be visible/hidden/attached/detached
`explorer_select_option`	Select Option	Pick a value from a dropdown
`explorer_fill_form`	Fill Form	Fill multiple form fields using refs from snapshot
`explorer_take_screenshot`	Take Screenshot	Capture screenshot (full page or specific element)
`explorer_close`	Close Browser	Close the page and release resources
`explorer_console_messages`	Get Console Messages	Return recent browser console messages
`explorer_drag`	Drag And Drop	Drag from one element to another using refs
`explorer_evaluate`	Evaluate Script	Run JavaScript in the page context
`explorer_file_upload`	Upload File	Upload files via file input or file chooser
`explorer_handle_dialog`	Handle Dialog	Accept or dismiss alert/confirm/prompt dialogs
`explorer_navigate_back`	Navigate Back	Go back in browser history
`explorer_network_requests`	Get Network Requests	Return observed network requests
`explorer_press_key`	Press Key	Press a keyboard key (Enter, Tab, etc.)
`explorer_snapshot`	Accessibility Snapshot	Capture accessibility tree with element refs (e1, e2, ...)
`explorer_tabs`	Manage Tabs	List, create, close, or select browser tabs
`explorer_install`	Install Browser	No-op (uses system Chrome/Edge)
`explorer_find`	Find Element	Find elements by text, position, or selector — returns a ref
`explorer_interact`	Interact with Element	Multi-action: click, type, hover, clear, select, focus, check, uncheck
`explorer_scrape_menu`	Scrape Menu Structure	Extract hierarchical navigation menus as JSON
`explorer_scrape_page`	Scrape Page Content	Extract forms, tables, and filters as structured JSON

File Bridge Commands Reference

The file bridge accepts all of the above tool commands (by alias) plus additional bridge-only commands:

Commands That Map to LM Tools

These bridge commands invoke the same tool code as the LM tools:

Bridge Command	Maps To LM Tool	Notes
`navigate`	`explorer_navigate`	`url` arg (or `target` as alias)
`click`	`explorer_click`	`target` arg maps to `text`
`hover`	`explorer_hover`	Same spatial targeting (above, below, etc.)
`typeText`	`explorer_type`	`into` arg maps to field identifier
`typeFromFile`	`explorer_type_from_file`
`pressKey`	`explorer_press_key`	`key` or `target` arg
`selectOption`	`explorer_select_option`	`comboBox` + `value` args
`fillForm`	`explorer_fill_form`
`screenshot`	`explorer_take_screenshot`	`filename` or `outputPath` args
`consoleMessages`	`explorer_console_messages`
`networkRequests`	`explorer_network_requests`
`handleDialog`	`explorer_handle_dialog`
`uploadFile`	`explorer_file_upload`
`evaluate`	`explorer_evaluate`	`expression` arg maps to `function`
`navigateBack`	`explorer_navigate_back`
`goBack`	`explorer_navigate_back`	Alias for `navigateBack`
`snapshot`	`explorer_snapshot`
`find`	`explorer_find`
`interact`	`explorer_interact`
`scrapeMenu`	`explorer_scrape_menu`
`scrapePage`	`explorer_scrape_page`
`drag` / `dragTo`	`explorer_drag`	`source` + `target` args
`close`	`explorer_close`
`closeBrowser`	`explorer_close`	Alias
`tabs`	`explorer_tabs`
`listTabs`	`explorer_tabs` (list)
`newTab`	`explorer_tabs` (new)
`closeTab`	`explorer_tabs` (close)
`selectTab`	`explorer_tabs` (select)
`waitForText`	`explorer_wait_for`
`waitForElement`	`explorer_wait_for_element`
`wait`	`explorer_wait_for`	`ms` or `time` arg
`resize`	`explorer_resize`
`resizeBrowser`	`explorer_resize`	Alias
`fullscreenBrowser`	`explorer_resize` (fullscreen)
`extract`	`explorer_extract`

Bridge-Only Commands (No LM Tool Equivalent)

These commands exist only in the file bridge and are implemented directly against BrowserManager / Playwright:

Bridge Command	Description
`launchBrowser`	Open a browser window (optionally navigate to a URL)
`scrollDown`	Scroll down (default 500px, configurable via `pixels` arg)
`scrollUp`	Scroll up
`scrollRight`	Scroll right (default 300px)
`scrollLeft`	Scroll left
`doubleClick`	Double-click an element by text or ref
`rightClick`	Right-click an element by text or ref
`refresh`	Reload the current page
`goForward`	Go forward in browser history
`switchWindow`	Switch to a tab by title match
`getPageInfo`	Get current URL and page title
`getPageContent`	Get raw HTML content (truncated to 50KB)
`getPageText`	Get visible text content (truncated to 50KB)
`getAccessibilityTree`	Get accessibility tree (delegates to snapshot tool)
`highlight`	Visually outline an element on the page
`getDialogText`	Get text from the current dialog
`startRecording`	Begin recording API-based actions (legacy)
`stopRecording`	Stop API recording and return generated Python script
`startStepRecorder`	Start browser session recording (optionally with `url`)
`stopStepRecorder`	Stop browser session and generate output
`restartExtensionHost`	Restart VS Code extension host (useful after VSIX install)
`runScript`	Execute a JSON automation script

How LM Tools and Bridge Commands Relate

Are the commands duplicated?

No. There is one set of 28 tool classes (in tools.ts). The LM tool registration and the bridge both point to the same instances:

LM Path: Copilot → vscode.lm.registerTool('explorer_navigate', NavigateTool) → NavigateTool.invoke()
Bridge Path: CLI → input.json → bridge → BRIDGE_TO_TOOL['navigate'] → toolInstances.navigate.invoke()

The bridge adds convenience aliases (e.g., goBack → navigateBack, closeBrowser → close) and bridge-only commands (scrolling, page info, recording) that don't need the full LM tool interface.
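
The alias routing can be modelled in a few lines of Python. The real implementation is TypeScript (tools.ts), so this is only a sketch of the idea: the two stand-in classes, their return strings, and handle_bridge_command are hypothetical, while the BRIDGE_TO_TOOL name and the closeBrowser → close alias come from the description above.

```python
class NavigateTool:
    """Stand-in for a tool class; the real ones are TypeScript classes."""

    def invoke(self, args):
        return f"navigated to {args['url']}"


class CloseTool:
    """Second stand-in, used to show two command names sharing one instance."""

    def invoke(self, args):
        return "browser closed"


# One instance per tool, shared by every access path.
tool_instances = {"navigate": NavigateTool(), "close": CloseTool()}

# Bridge command name -> key in tool_instances. Aliases point at the same key.
BRIDGE_TO_TOOL = {
    "navigate": "navigate",
    "close": "close",
    "closeBrowser": "close",
}


def handle_bridge_command(command, args):
    """Resolve a bridge command (or alias) and invoke the shared tool instance."""
    tool_key = BRIDGE_TO_TOOL.get(command)
    if tool_key is None:
        raise KeyError(f"unknown bridge command: {command}")
    return tool_instances[tool_key].invoke(args)
```

Because an alias is just another key in the lookup table, adding one does not create a second copy of the tool.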

The Command Palette exposes 41 VS Code commands total. Of these, 27 (e.g., webcure.testNavigate) are test harnesses — they prompt for input via vscode.window.showInputBox and invoke the same tool instances, displaying results in the Output panel. The remaining commands handle recording, assertions, the API server, and script execution.

VS Code Commands (Command Palette)

All commands are accessible via Cmd+Shift+P (macOS) / Ctrl+Shift+P (Windows/Linux) under the WebCure category:

Command	What it does
WebCure: Navigate to URL	Prompts for a URL and navigates
WebCure: Click Element	Prompts for text/selector and clicks
WebCure: Hover Element	Hover over an element
WebCure: Type Text	Type into a field
WebCure: Type Text from File	Type file contents into a field
WebCure: Press Key	Press a keyboard key
WebCure: Take Screenshot	Save a screenshot
WebCure: Take Snapshot	Capture accessibility tree with refs
WebCure: Find Element	Find element by text/selector
WebCure: Interact with Element	Multi-action on an element by ref
WebCure: Select Option	Pick a dropdown value
WebCure: Fill Form	Fill multiple form fields
WebCure: Drag Element	Drag and drop
WebCure: Evaluate JavaScript	Run JS in page context
WebCure: Extract Text Content	Extract visible text
WebCure: Wait For Text	Wait for text on page
WebCure: Wait For Element	Wait for element state
WebCure: Resize Browser Window	Resize viewport
WebCure: Navigate Back	Go back in history
WebCure: Manage Tabs	List/create/close/select tabs
WebCure: Close Browser	Close the browser
WebCure: Get Console Messages	Show browser console output
WebCure: Get Network Requests	Show observed network requests
WebCure: Handle Dialog	Accept/dismiss a dialog
WebCure: Upload File	Upload files
WebCure: Scrape Menu/Navigation	Extract menu structure
WebCure: Scrape Page Structure	Extract forms/tables
WebCure: Tools Menu	Quick-pick menu of all tools
WebCure: Start API Server	Start the HTTP API on port 5678
WebCure: Stop API Server	Stop the HTTP API
WebCure: Record API Script (Legacy)	Begin recording API-based browser actions
WebCure: Stop API Script Recording	Stop and generate Python script
WebCure: Record Browser Session	Start session recording (choose output)
WebCure: Stop Browser Session	Stop session recording and save output
WebCure: Insert Wait Step	Insert a timed pause at a specific point
WebCure: Assert: Element	Pick assertion type, then click target element (`Cmd+Shift+A`)
WebCure: Assert: Page Title	Assert current page title matches
WebCure: Assert: Page URL	Assert current page URL (exact or contains)
WebCure: Assert: Element Count	Click element, then assert how many matching elements exist
WebCure: Assert: Full Page Snapshot	Assert page body contains specific text
WebCure: Run Script	Execute a JSON automation script

HTTP API Server

WebCure includes an HTTP API server for programmatic access from external scripts or tools.

Enable via settings (webcure.api.enabled: true) or start manually:

Cmd+Shift+P → WebCure: Start API Server

Endpoints:

# Invoke any command
curl -X POST http://localhost:5678/invoke \
  -H "Content-Type: application/json" \
  -d '{"command": "navigate", "args": {"url": "https://example.com"}}'

# List available tools
curl http://localhost:5678/tools

# Health check
curl http://localhost:5678/health
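
The same POST /invoke call can be made from Python with only the standard library. This is a sketch that mirrors the curl example above; the function names are illustrative, and because the response format is not documented here, the body is returned as raw text.

```python
import json
import urllib.request


def build_invoke_request(command, args=None, host="127.0.0.1", port=5678):
    """Build the POST /invoke request with the same JSON body as the curl example."""
    body = json.dumps({"command": command, "args": args or {}}).encode("utf-8")
    return urllib.request.Request(
        f"http://{host}:{port}/invoke",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke(command, args=None, timeout=30, **server):
    """Send the request and return the raw response text (shape not assumed)."""
    request = build_invoke_request(command, args, **server)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With the API server running, invoke("navigate", {"url": "https://example.com"}) sends the same request as the first curl command.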

Browser Session Recording

WebCure can record every user interaction in the browser and produce human-readable documentation and/or a Playwright Python test script. Choose from three output modes when you start a recording.

Recording Modes

Command Palette → WebCure: Record Browser Session presents a mode picker:

Mode	What is produced
Markdown + Screenshots	A timestamped folder containing `Recording.md` plus a screenshot for each step. Great for test documentation, bug reports, and onboarding guides.
Python Test Script	A standalone Playwright Python script with self-healing locators. No screenshots folder is created.
Both	Markdown with screenshots and a Python script, saved together in the same session folder.

How It Works

Start recording: Command Palette → WebCure: Record Browser Session
Choose output mode (Markdown / Python / Both)
(Markdown/Both) Optionally enter a folder name (blank = auto-timestamp)
(Python/Both) Optionally enter a script filename (default test_recording.py)
(Python/Both) Optionally enter a default wait between steps in seconds (e.g. 1). If set, time.sleep(N) is added after each action in the generated script.
Optionally enter an initial URL (defaults to https://demo.testfire.net)
Interact normally — every click, form input, file upload, and Enter key press is captured
Stop recording: Command Palette → WebCure: Stop Browser Session, or simply close the browser window

What Gets Captured

User Action	Recorded As	Screenshot
Click a link or button	`Clicked on button 'Login'`	Taken after the page reacts to the click
Click a dropdown trigger	`Clicked on button 'Options'`	Captured via deferred `pointerdown`
Click a menu item	`Selected 'Edit' from 'Options' dropdown`	ARIA-aware with trigger label lookup
Select a dropdown option	`Selected menu item 'Orange'`	Deferred pointerdown for Radix Select
Change a `<select>`	`Selected 'Active' from 'Status'`	Taken after change
Type into a field	`Typed 'admin' into 'Username'`	Taken immediately (shows the typed value)
Type a password	`Typed '********' into 'Password'`	Password value is masked
Press Enter	`Pressed 'Enter' on 'Search'`	Taken after the page reacts
Upload a file	`Uploaded file to 'Resume'` (path captured via dialog)	Taken after file selected
Navigate to URL	`Performed 'navigate' on 'Navigated to https://...'`	Taken after page load
Close the browser	`Performed 'close' on 'Browser window closed'`	No screenshot (browser is gone)
Insert Wait Step	`Wait 2 seconds`	No screenshot (no UI change)

Output Structure

Markdown + Screenshots and Both modes create a timestamped folder:

WebCure_Steps_2026-03-09_22-13-00/
├── Recording.md          # Markdown log with all steps
├── test_recording.py     # Python script (Both mode only)
├── step_1.png
├── step_2.png
└── ...

Python-only mode writes a single timestamped script directly to the workspace root:

test_recording_2026-03-09_22-13-00.py

The Markdown file contains structured entries like:

### Step 3

**Action:** Typed 'admin' into 'Username'

![Screenshot for Step 3](./step_3.png)

The Python script uses self-healing locators that try multiple strategies in confidence order (test ID → ID → ARIA role → link text → visible text → name → CSS → XPath):

# Step 3: Typed 'admin' into 'Username'
self_healing_fill(page, [
    {"strategy": "id", "value": "uid", "confidence": 0.95},
    {"strategy": "ariaLabel", "value": "Username", "confidence": 0.85},
    {"strategy": "css", "value": "input[name='uid']", "confidence": 0.6},
], "admin")
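
The body of self_healing_fill is not shown here, so the following is only a sketch of the fallback idea: try candidates from highest to lowest confidence and use the first one that resolves. The helper name first_working_locator and the matches callback are hypothetical; in a real script the callback would probe the page (for example with a Playwright locator count).

```python
def first_working_locator(candidates, matches):
    """Return the highest-confidence candidate for which matches(candidate) is true.

    candidates: list of {"strategy", "value", "confidence"} dicts, as in the
    generated script. matches: hypothetical probe that reports whether a
    candidate currently resolves to an element.
    """
    ordered = sorted(candidates, key=lambda c: c["confidence"], reverse=True)
    for candidate in ordered:
        try:
            if matches(candidate):
                return candidate
        except Exception:
            # A strategy that errors out is treated like one that found nothing.
            continue
    return None


candidates = [
    {"strategy": "id", "value": "uid", "confidence": 0.95},
    {"strategy": "ariaLabel", "value": "Username", "confidence": 0.85},
    {"strategy": "css", "value": "input[name='uid']", "confidence": 0.6},
]

# Simulate a page where the element's ID has changed, so the "id" strategy fails.
chosen = first_working_locator(candidates, lambda c: c["strategy"] != "id")
print(chosen["strategy"])  # ariaLabel
```

Ordering by confidence means a renamed ID degrades to the next-best strategy instead of failing the step outright.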

Assertions

During a recording session, you can insert assertions to verify the state of the page or specific elements. Assertions are recorded as steps and generate corresponding Python assertion calls in the output script.

Quick Start

Start a recording — Command Palette → WebCure: Record Browser Session (Python or Both mode)
Interact normally — click, type, navigate
Insert an assertion — press Cmd+Shift+A (macOS) / Ctrl+Shift+A, pick the assertion type, then click the target element
Stop recording — the generated Python script includes assertion calls with pass/fail logging

Element Assertions

Press Cmd+Shift+A or Command Palette → Assert: Element to see a QuickPick menu:

Assertion	What It Verifies	Example Use Case
Visible	Element is visible on the page	Verify a success message appeared after form submission
Not Visible	Element is NOT visible	Verify an error banner disappeared after correction
Text	Element contains specific text	Verify a heading shows "Welcome, Admin" after login
Value	Input/select has a specific value	Verify a form field retained its value after page reload
Checked	Checkbox/radio is checked	Verify "Remember me" checkbox is checked by default
Not Checked	Checkbox/radio is NOT checked	Verify a terms checkbox starts unchecked
Enabled	Element is enabled (clickable)	Verify the submit button is active after filling required fields
Disabled	Element is disabled	Verify the submit button is greyed out before filling the form
Attribute	Element attribute has expected value	Verify a link's `href` attribute points to the correct URL

After picking the type, click the target element on the page. WebCure captures the element's locators and current state, then records the assertion step.

Page-Level Assertions

These don't require clicking an element:

Command	What It Verifies	Example Use Case
Assert: Page Title	`document.title` matches exactly	Verify the page title changes to "Dashboard" after login
Assert: Page URL	URL matches (exact or contains)	Verify redirect to `/dashboard` after successful login
Assert: Element Count	Number of matching elements	Verify exactly 5 search results are displayed
Assert: Full Page Snapshot	Page body contains specific text	Verify "Sign Off" link appears somewhere on the page

Pass/Fail Logging

Generated Python scripts include a built-in logging system:

Each step is wrapped in try/except with _record_step() for pass/fail tracking
Terminal output shows ✅/❌ icons with step descriptions in real time
Log file is saved as test_results_YYYYMMDD_HHMMSS.log
Summary prints at the end: TEST SUMMARY: 34/35 steps passed, 1 failed
Exit code is 1 if any step failed, 0 if all passed

Example output:

✅  Step 1: Navigate to https://example.com — PASS
✅  Step 2: Assert page title is 'Example' — PASS
✅  Step 3: Typed 'admin' into username — PASS
❌  Step 4: Assert heading text contains 'Welcome' — FAIL  [Expected text 'Welcome' in 'Hello']

============================================================
TEST SUMMARY: 3/4 steps passed, 1 failed
============================================================
Failed steps:
  Step 4: Assert heading text contains 'Welcome'
    Error: Expected text 'Welcome' in 'Hello'
Full log saved to: test_results_20260329_160005.log
============================================================
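
The try/except wrapping and the summary line can be sketched as follows. This is not the generated code: the StepLog class and its method names are hypothetical stand-ins for _record_step(), and the sketch prints plain PASS/FAIL text and omits the log file.

```python
class StepLog:
    """Hypothetical pass/fail tracker illustrating the pattern described above."""

    def __init__(self):
        self.results = []  # list of (description, error message or None)

    def run(self, description, action):
        """Run one step; record a failure instead of letting it abort the script."""
        number = len(self.results) + 1
        try:
            action()
        except Exception as exc:  # AssertionError is included
            self.results.append((description, str(exc)))
            print(f"FAIL  Step {number}: {description}  [{exc}]")
        else:
            self.results.append((description, None))
            print(f"PASS  Step {number}: {description}")

    def summary(self):
        """Return the summary line and the process exit code (1 if any step failed)."""
        failed = [r for r in self.results if r[1] is not None]
        total = len(self.results)
        line = f"TEST SUMMARY: {total - len(failed)}/{total} steps passed, {len(failed)} failed"
        return line, (1 if failed else 0)
```

A script built this way would end by printing the summary line and passing the second value to sys.exit(), which gives the exit-code behaviour listed above.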

Generated Python Code

Assertions use the same self-healing locator system as action steps:

# Step 5: Assert heading text contains 'Hello Admin User'
assert_element_text(page, [
    {"strategy": "css", "value": "h1", "confidence": 0.8},
    {"strategy": "xpath", "value": "//h1", "confidence": 0.5},
], "Hello Admin User")

# Step 6: Assert page URL contains 'dashboard'
assert_page_url(page, "dashboard", "contains")

# Step 7: Assert exactly 2 checkboxes exist
assert_element_count(page, [
    {"strategy": "css", "value": "input[type='checkbox']", "confidence": 0.9},
], 2)

Inserting Sleep Steps

To insert an explicit pause at a specific point during recording: Command Palette → WebCure: Insert Wait Step. You will be prompted for a duration in seconds. This adds a time.sleep(N) call at that exact position in the generated Python script.

For a uniform wait after every step, use the default wait between steps option at recording start instead.

File Upload Recording

When the browser opens a native file-chooser dialog, WebCure:

Intercepts it automatically
Prompts you to enter the file path (or leave blank to record a placeholder)
Records the upload step with captured locators

The generated Python script emits an upload_file() call with self-healing locators, or a # TODO comment if no path was provided.

Element Identification

The step recorder uses multiple heuristics to produce human-readable element names:

<label for="..."> associations — preferred for form fields
Parent <label> wrappers — for fields wrapped inside labels
Adjacent table cell text — for table-based layouts (e.g., "Username:" from a neighboring <td>)
Previous sibling text — <span>, <label>, <b> elements before the input
Button value/text — for <button> and <input type="submit">, the button's own text takes priority
ARIA attributes — aria-label, title, placeholder
ARIA roles — role="menuitem", role="option", etc. with parent menu trigger lookup via aria-labelledby
Element ID or name — as a fallback

Each step also records the element's CSS selector and XPath in HTML comments for reference.

Radix UI / Portal-Based Component Support

The step recorder handles modern component libraries (Radix UI, shadcn/ui, etc.) that use pointerdown events and portal-rendered menus:

Dropdown triggers: Captured via a "deferred pointerdown" strategy — the recorder captures every pointerdown on interactive elements, waits 400ms, and records it as a click only if no matching click event follows. This handles Radix UI DropdownMenu triggers that open a portal on pointerdown, causing the click to land on document.body.
Select options: Radix UI Select options (role="option") are removed from the DOM between pointerdown and mouseup, so the click event never fires on the option. The deferred pointerdown timer catches these and records the selection.
Menu items: Elements with role="menuitem" (or data-slot="dropdown-menu-item" / data-slot="select-item") are detected and the recorder walks up the DOM to find the parent [role="menu"] or [role="listbox"] container, then resolves the trigger label via aria-labelledby.
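
The deferred pointerdown rule can be modelled as a pure function over a list of timestamped events. The recorder itself runs in the browser, so this Python sketch only illustrates the decision rule; the event tuple layout and the function name are assumptions, and the 400 ms window is the value given above.

```python
DEFER_MS = 400  # wait window described above


def promoted_pointerdowns(events, defer_ms=DEFER_MS):
    """Return the pointerdowns that should be recorded as clicks.

    events: list of (time_ms, kind, target), kind being "pointerdown" or "click".
    A pointerdown is promoted only if no click on the same target arrives
    within defer_ms of it.
    """
    promoted = []
    for time_ms, kind, target in events:
        if kind != "pointerdown":
            continue
        followed_by_click = any(
            other_kind == "click"
            and other_target == target
            and time_ms <= other_time <= time_ms + defer_ms
            for other_time, other_kind, other_target in events
        )
        if not followed_by_click:
            promoted.append((time_ms, target))
    return promoted


events = [
    (0, "pointerdown", "Options"),   # dropdown trigger opens a portal...
    (50, "click", "body"),           # ...so the click lands on the body
    (1000, "pointerdown", "Login"),  # ordinary button
    (1080, "click", "Login"),        # matching click follows, nothing to promote
]
print(promoted_pointerdowns(events))  # [(0, 'Options')]
```

Only the dropdown trigger is promoted; the ordinary button is already covered by its own click event.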

CLI Access

The step recorder can also be started and stopped via the CLI, enabling AI agents to trigger recordings programmatically:

# Markdown mode (default)
node .webcure/cli.js startStepRecorder url=https://example.com
# Python mode with 1-second wait between steps
node .webcure/cli.js startStepRecorder url=https://example.com mode=python defaultWaitSeconds=1
# ... interact with the browser ...
node .webcure/cli.js stopStepRecorder

Automatic Stop on Browser Close

If you close the browser window while recording is active, the recorder:

Logs a final "Browser window closed" step (without a screenshot)
Waits for any pending steps to finish writing
Automatically stops recording and opens the Markdown preview

API Script Recording & Python Playback (Legacy)

WebCure can record your browser actions and generate a Python script that replays them via the API server. This is the original recording method — for most use cases, Browser Session Recording is recommended instead.

Step 1: Install the Python Client

The webcure Python package lives in the python/ directory of this repository. Install it once:

pip install /path/to/webcure/python

Or in development/editable mode:

pip install -e /path/to/webcure/python

This provides module-level convenience functions (navigate, click, type_text, etc.) that the generated scripts import.

Step 2: Record Actions

Open the Command Palette (Cmd+Shift+P)
Run WebCure: Record API Script (Legacy)
Perform browser actions — navigate, click, type, resize, etc. using any WebCure command (Command Palette or Copilot)
Run WebCure: Stop API Script Recording (works even if the browser was closed during the session)

All actions are logged in the WebCure Tools Output channel. Start and stop recording events are also logged there with timestamps.

A new Python script opens in your editor with the recorded actions.

Step 3: Run the Generated Script

The script needs the WebCure API server running to execute commands:

Start the API server — Command Palette → WebCure: Start API Server (or it auto-starts when you stop recording)
Run the script:

python recording.py

Example Output

#!/usr/bin/env python3
# Auto-generated by WebCure

from webcure import click, close_browser, navigate, resize_browser, type_text

navigate("https://demo.testfire.net")
resize_browser("fullscreen")
click("ONLINE BANKING LOGIN")
type_text("admin", into="#uid")
type_text("admin", into="#passw")
click("#login > table > tbody > tr:nth-child(3) > td:nth-child(2) > input[type=submit]")
click("#btnGetAccount")
close_browser()

The Python client automatically detects whether a target is a CSS selector (e.g., #uid, .class, div > span) or visible text (e.g., "ONLINE BANKING LOGIN") and routes it to the correct Playwright locator strategy.
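
The client's actual detection rules are not shown here. As a rough illustration of how such a check could work, the sketch below treats a target as a CSS selector if it starts with #, ., or [, contains a combinator surrounded by spaces, or looks like a lowercase tag followed by selector syntax. The function name and the pattern are assumptions, and a heuristic like this will misclassify some inputs.

```python
import re

# Hypothetical heuristic, not the webcure package's real rule set.
_CSS_HINT = re.compile(
    r"^[#.\[]"                            # starts like an ID, class, or attribute selector
    r"|\s[>+~]\s"                         # contains a combinator, e.g. "div > span"
    r"|^[a-z][a-z0-9]*(\[|[#.:][\w-])"    # tag followed by [attr], #id, .class, or :pseudo
)


def looks_like_css_selector(target):
    """Guess whether a target string is a CSS selector rather than visible text."""
    return bool(_CSS_HINT.search(target.strip()))


for sample in ["#uid", ".class", "div > span", "ONLINE BANKING LOGIN"]:
    kind = "selector" if looks_like_css_selector(sample) else "text"
    print(f"{sample!r}: {kind}")
```

The four samples are the ones named above; the first three are classified as selectors and the last as visible text.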

Python Client API

The webcure package also supports class-based usage:

from webcure import WebCure

wc = WebCure(port=5678)
wc.invoke("navigate", {"url": "https://example.com"})
print(wc.health())   # True if API server is running
print(wc.tools())    # List available tool names

To change the default port for module-level functions:

import webcure
webcure.set_port(9999)
webcure.navigate("https://example.com")

JSON Script Runner

Execute multi-step automation scripts with variables, capture patterns, and retry logic.

Script Format

{
  "name": "Login and verify dashboard",
  "stopOnError": true,
  "retries": 1,
  "retryDelay": 2000,
  "variables": {
    "baseUrl": "https://example.com",
    "username": "admin"
  },
  "steps": [
    {
      "command": "navigate",
      "args": { "url": "${baseUrl}/login" }
    },
    {
      "command": "typeText",
      "args": { "text": "${username}", "into": "Username" }
    },
    {
      "command": "click",
      "args": { "target": "Sign In" }
    },
    {
      "command": "find",
      "args": { "text": "Dashboard" },
      "captureRef": "dashRef"
    },
    {
      "command": "interact",
      "args": { "action": "click", "ref": "${dashRef}" }
    }
  ]
}
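
In the script above, `${baseUrl}` and `${username}` are filled in from `variables`, and `captureRef` stores a reference from the `find` step for the later `interact` step. The sketch below shows one way such substitution and capture could work. The function names are hypothetical, and it assumes that the captured value is the bare `eN` identifier and that unknown variables are left untouched; the runner's real behavior may differ.

```python
import re
from typing import Any, Optional

_VAR = re.compile(r"\$\{(\w+)\}")    # matches ${varName}
_REF = re.compile(r"\[ref=(e\d+)\]")  # matches [ref=eN]


def substitute(value: Any, variables: dict) -> Any:
    """Recursively replace ${varName} in strings, dicts, and lists."""
    if isinstance(value, str):
        # Unknown variables keep their original ${...} text (an assumption).
        return _VAR.sub(lambda m: str(variables.get(m.group(1), m.group(0))), value)
    if isinstance(value, dict):
        return {key: substitute(item, variables) for key, item in value.items()}
    if isinstance(value, list):
        return [substitute(item, variables) for item in value]
    return value


def capture_ref(output: str) -> Optional[str]:
    """Return the first eN identifier found in step output, or None."""
    match = _REF.search(output)
    return match.group(1) if match else None


if __name__ == "__main__":
    variables = {"baseUrl": "https://example.com", "username": "admin"}
    print(substitute({"url": "${baseUrl}/login"}, variables))
    # Hypothetical step output; the real output format may differ.
    variables["dashRef"] = capture_ref('link "Dashboard" [ref=e12]')
    print(substitute({"action": "click", "ref": "${dashRef}"}, variables))
```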

Running Scripts

Command Palette: Cmd+Shift+P (Ctrl+Shift+P on Windows/Linux) → WebCure: Run Script → select a .json file
CLI: node .webcure/cli.js runScript file=/path/to/script.json

Features

Feature	Description
Variables	Define at script level, use `${varName}` in any step arg
captureRef	Auto-extract `[ref=eN]` from step output into a variable
capturePattern	Regex with capture group to extract values from output text
captureValue	Map `{ variableName: "property.path" }` to extract structured values
Retries	Script-level `retries` + `retryDelay` with per-step overrides
stopOnError	Halt on the first failure when `true` (the default) or continue when `false`; can be overridden per step
Command Aliases	Both `camelCase` and `snake_case` accepted
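
The capture and retry rows of the table can be sketched as follows. This is an illustration under stated assumptions, not the runner's code: the function names are hypothetical, `retries` is assumed to mean extra attempts after the first, `retryDelay` is assumed to be in milliseconds, and `captureValue` paths are assumed to be dot-separated keys into a nested object.

```python
import re
import time
from typing import Any, Callable, Optional


def capture_pattern(output: str, pattern: str) -> Optional[str]:
    """Return the first capture group of `pattern` in `output`, or None."""
    match = re.search(pattern, output)
    return match.group(1) if match else None


def capture_value(result: dict, mapping: dict) -> dict:
    """Resolve { variableName: "property.path" } against a nested dict."""
    captured = {}
    for name, path in mapping.items():
        current: Any = result
        for key in path.split("."):
            if not isinstance(current, dict) or key not in current:
                break  # path does not exist; skip this variable
            current = current[key]
        else:
            captured[name] = current
    return captured


def run_with_retries(step: Callable[[], Any], retries: int = 1,
                     retry_delay_ms: int = 2000) -> Any:
    """Call `step`; on an exception, retry up to `retries` more times."""
    last_error: Optional[Exception] = None
    for attempt in range(retries + 1):
        try:
            return step()
        except Exception as exc:
            last_error = exc
            if attempt < retries:
                time.sleep(retry_delay_ms / 1000)
    assert last_error is not None
    raise last_error


if __name__ == "__main__":
    print(capture_pattern("Order #12345 confirmed", r"Order #(\d+)"))
    print(capture_value({"page": {"title": "Dashboard"}}, {"title": "page.title"}))
```

A per-step override could then be resolved by looking up the setting on the step first and falling back to the script-level value.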

Configuration

Open Settings (Cmd+, / Ctrl+,) and search for webcure:

Setting	Default	Description
`webcure.api.enabled`	`false`	Enable the HTTP API server on activation
`webcure.api.port`	`5678`	Port for the HTTP API server
`webcure.api.host`	`127.0.0.1`	Host address for the HTTP API server
`webcure.bridge.enabled`	`true`	Enable the file bridge for AI agent integration

Environment Variables

Variable	Default	Description
`WEBEXPLORER_BROWSER`	(Chrome)	Set to `msedge` to use Microsoft Edge instead of Chrome

Development

Build from Source

cd ~/Developer/webcure

# Install dependencies
npm install

# Compile TypeScript
npm run compile

# Watch mode (recompile on file changes)
npm run watch

# Package into .vsix
npm run package

Run in Extension Development Host

Open the webcure/ folder in VS Code
Press F5
A new VS Code window opens with the extension loaded
Open the Command Palette and type "WebCure" to test commands

Project Structure

webcure/
├── src/
│   ├── extension.ts          # Entry point: registers tools, bridge, API, commands
│   ├── tools.ts              # 28 Language Model Tool classes
│   ├── browserManager.ts     # Playwright-core browser singleton
│   ├── apiServer.ts          # HTTP API server
│   ├── constants.ts          # Bridge directory/file names
│   ├── types.ts              # Shared types
│   ├── bridge/
│   │   ├── file-bridge.ts    # File-based command router
│   │   └── cli-template.js   # CLI helper (copied to .webcure/)
│   └── recorder/
│       ├── action-log.ts       # Start/stop/record actions
│       ├── script-generator.ts # Convert actions to Python
│       ├── step-recorder.ts   # Step recorder (Markdown + Python + assertions)
│       └── element-rules-engine.ts  # W3C/ARIA element classification engine
├── python/
│   ├── pyproject.toml        # Python package metadata
│   ├── setup.py              # Package setup (pip install)
│   └── webcure/
│       ├── __init__.py       # Convenience functions (navigate, click, etc.)
│       └── client.py         # WebCure API client class
├── tests/
│   ├── MANUAL-TEST-RESULTS.md  # Manual test documentation
│   ├── bridge-integration-tests.sh  # Automated bridge integration tests
│   ├── unit/
│   │   ├── tools.test.ts     # Unit tests (bridge routing, recording, params)
│   │   └── element-rules-engine.test.ts  # 113 unit tests for the rules engine
│   └── integration/
│       ├── live_engine_test.py  # 63 Python live browser tests against real sites
│       ├── test_assertions.py   # 46 assertion helper integration tests
│       ├── test_recorded_assertions.py  # 35-step end-to-end recorded-style test
│       └── screenshots/         # Test screenshots captured during live tests
├── status/
│   ├── project_status_01.md  # Initial release status report
│   ├── project_status_02.md  # Recording fix & Python package
│   ├── project_status_03.md  # Action persistence & interact tool fixes
│   ├── project_status_04.md  # Step recorder feature
│   ├── project_status_05.md  # Radix UI fixes & CLI step recorder
│   ├── project_status_06.md  # Deferred pointerdown & Select support
│   ├── project_status_07.md  # Element rules engine
│   ├── project_status_08.md  # Python test script generation
│   └── project_status_09.md  # Assertion recording & pass/fail logging
├── out/                      # Compiled JavaScript (tsc output)
├── package.json              # Extension manifest + tool/command declarations
├── tsconfig.json             # TypeScript configuration
└── README.md

Testing

Test Dependencies

# TypeScript unit tests (playwright-core + tsx)
npm install                    # installs playwright-core, tsx, typescript from package.json

# Python live browser integration tests
pip install playwright         # Python Playwright bindings (v1.58+)

# Browser binaries (shared by both TS and Python tests)
npx playwright install chromium

Running Tests

# TypeScript unit tests — 113 tests for the Element Rules Engine
npx tsx tests/unit/element-rules-engine.test.ts

# Bridge routing + parameter transformation unit tests
npm run test:unit

# Python live browser integration tests — 63 tests against real websites
# (demo.testfire.net, the-internet.herokuapp.com, Radix UI Themes Playground, W3C WAI-ARIA)
python3 tests/integration/live_engine_test.py

# Assertion helper integration tests — 46 tests against live websites
python3 tests/integration/test_assertions.py

# End-to-end recorded-style assertion test — 35 steps with pass/fail logging
python3 tests/integration/test_recorded_assertions.py

# Automated bridge integration tests (requires VS Code + extension active)
bash tests/bridge-integration-tests.sh

Troubleshooting

Extension doesn't activate

Make sure your VS Code version is 1.95 or later (required for vscode.lm.registerTool)
Check the Output panel → select "WebCure Tools" for error messages

Browser doesn't launch

WebCure uses playwright-core, which connects to your system Chrome (not a bundled Chromium)
Make sure Google Chrome or Microsoft Edge is installed
To use Edge: set environment variable WEBEXPLORER_BROWSER=msedge

Copilot doesn't use the tools

Ensure GitHub Copilot is active and you have a Copilot subscription
The tools should appear when you type #explorer_ in Copilot chat
Try asking Copilot explicitly: "Use explorer_navigate to go to https://example.com"

File bridge doesn't respond

Check that webcure.bridge.enabled is true in settings
Verify .webcure/ directory exists in your workspace root
Check that input.json is being created (the CLI writes it)
Look at the Output panel → "WebCure Tools" for errors
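
The directory and file checks above can be scripted. The sketch below is a hypothetical diagnostic helper, not something shipped with WebCure. It assumes the bridge uses the file names shown in the architecture overview (`cli.js`, `input.json`, `output.json`) directly inside `.webcure/`.

```python
from pathlib import Path

BRIDGE_DIR = ".webcure"
BRIDGE_FILES = ("cli.js", "input.json", "output.json")


def check_bridge(workspace_root) -> dict:
    """Report whether the bridge directory and its expected files exist."""
    bridge = Path(workspace_root) / BRIDGE_DIR
    report = {"directory": bridge.is_dir()}
    for name in BRIDGE_FILES:
        report[name] = (bridge / name).is_file()
    return report


if __name__ == "__main__":
    # Run from the workspace root.
    for item, present in check_bridge(".").items():
        print(f"{item}: {'found' if present else 'missing'}")
```

A missing `input.json` or `output.json` is not necessarily an error, since these files may only exist while a command is in flight; a missing directory or `cli.js` is the more telling sign.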

.vsix build fails

Run npm install first to ensure all dependencies are installed
If vsce is not found: npx @vscode/vsce package
