Commit fc3d200

feat: Add Amazon Bedrock AgentCore browser support with browser signing
Add comprehensive AgentCore browser integration with browser signing enabled:

Core Features:
- AgentCoreComputer class with CDP connection to AgentCore Browser
- Browser signing enabled for cryptographic agent identification
- Session recording to S3 with configurable bucket and prefix
- Auto-created IAM roles with minimal scoped permissions
- Configurable browser identifier via CLI, env var, or default
- Region detection with fallback chain (param > env > boto > default)

Browser Management:
- Browser reuse logic to avoid duplicate browser creation
- Dynamic browser ID generation for recording configurations
- Robust error handling and resource cleanup in session lifecycle
- Support for custom browser recording with unique identifiers

Browser Signing:
- Enables HTTP message signatures for web bot authentication
- Helps reduce CAPTCHAs when AI agents browse the web
- Cryptographically signs requests to identify as AI agent

IAM & Security:
- Browser identifier included in role names for clarity
- Bucket/prefix-based policy names for readability
- Always update IAM policy to match current bucket/prefix
- Trust policy scoped to bedrock-agentcore.amazonaws.com
- S3 permissions: PutObject, ListMultipartUploadParts, AbortMultipartUpload

CLI Arguments:
- --env=agentcore: Use AgentCore browser backend
- --recording_bucket: S3 bucket for session recordings
- --recording_prefix: S3 prefix for recordings (default: "recordings")
- --execution_role_arn: IAM role ARN for browser execution
- --create_execution_role: Auto-create IAM role if needed
- --browser_identifier: Browser identifier (default: "aws.browser.v1")

🤖 Assisted by Amazon Q Developer
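The region fallback chain named in the commit message (param > env > boto > default) can be sketched as a small pure function. This is a minimal illustration, not the committed code: `resolve_region` and its parameters are hypothetical names, the environment is passed in as a mapping so the logic can be checked without AWS credentials, and the order of the two environment variables follows the `AgentCoreComputer.__init__` diff in this commit.

```python
from collections.abc import Mapping
from typing import Optional


def resolve_region(
    explicit: Optional[str],
    env: Mapping[str, str],
    boto_region: Optional[str],
    default: str = "us-west-2",
) -> str:
    """Return the first non-empty region in priority order.

    Hypothetical helper: the priority mirrors the diff
    (argument, AGENTCORE_REGION, AWS_REGION, boto3 session region, default).
    """
    candidates = (
        explicit,
        env.get("AGENTCORE_REGION"),
        env.get("AWS_REGION"),
        boto_region,
    )
    for candidate in candidates:
        if candidate:  # skips both None and empty strings
            return candidate
    return default
```

In real use, `env` would be `os.environ` and `boto_region` would come from `boto3.session.Session().region_name`, which is `None` when no region is configured.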
1 parent c0e6049 commit fc3d200

8 files changed: +512 -3 lines changed
README.md

Lines changed: 42 additions & 1 deletion
@@ -94,6 +94,7 @@ You can specify a particular environment with the ```--env <environment>``` flag

 - `playwright`: Runs the browser locally using Playwright.
 - `browserbase`: Connects to a Browserbase instance.
+- `agentcore`: Connects to Amazon Bedrock AgentCore Browser.

 **Local Playwright**

@@ -117,6 +118,43 @@ Runs the agent using Browserbase as the browser backend. Ensure the proper Brows
 python main.py --query="Go to Google and type 'Hello World' into the search bar" --env="browserbase"
 ```

+**Amazon Bedrock AgentCore**
+
+Runs the agent using Amazon Bedrock AgentCore Browser as the backend. Requires AWS credentials configured and the `bedrock-agentcore` Python package installed.
+
+```bash
+python main.py --query="Search for great deals on Alexa devices" --env="agentcore"
+```
+
+The AWS region is automatically detected from your AWS configuration (environment variables, ~/.aws/config, or IAM role). You can override it by setting:
+
+```bash
+export AWS_REGION="us-east-1"
+```
+
+**Session Recording (AgentCore only)**
+
+Enable session recording to S3 for replay and debugging:
+
+```bash
+# Auto-create IAM role (recommended)
+python main.py --query="Search for great deals on Alexa devices" --env="agentcore" \
+  --recording_bucket="my-recordings-bucket" \
+  --create_execution_role
+
+# Or provide existing role
+python main.py --query="Search for great deals on Alexa devices" --env="agentcore" \
+  --recording_bucket="my-recordings-bucket" \
+  --recording_prefix="sessions" \
+  --execution_role_arn="arn:aws:iam::123456789012:role/AgentCoreRecordingRole"
+```
+
+The auto-created role is scoped to the specified S3 bucket/prefix with minimal permissions:
+- Trust policy: `bedrock-agentcore.amazonaws.com`
+- S3 permissions: `s3:PutObject`, `s3:ListMultipartUploadParts`, `s3:AbortMultipartUpload`
+
+Recordings can be viewed using the AgentCore session replay viewer.
+
 ## Agent CLI

 The `main.py` script is the command-line interface (CLI) for running the browser agent.
@@ -126,9 +164,11 @@ The `main.py` script is the command-line interface (CLI) for running the browser
 | Argument | Description | Required | Default | Supported Environment(s) |
 |-|-|-|-|-|
 | `--query` | The natural language query for the browser agent to execute. | Yes | N/A | All |
-| `--env` | The computer use environment to use. Must be one of the following: `playwright`, or `browserbase` | No | N/A | All |
+| `--env` | The computer use environment to use. Must be one of the following: `playwright`, `browserbase`, or `agentcore` | No | playwright | All |
 | `--initial_url` | The initial URL to load when the browser starts. | No | https://www.google.com | All |
 | `--highlight_mouse` | If specified, the agent will attempt to highlight the mouse cursor's position in the screenshots. This is useful for visual debugging. | No | False (not highlighted) | `playwright` |
+| `--recording_bucket` | S3 bucket name for session recording (bucket name only, not ARN). Example: `my-recordings-bucket` | No | None | `agentcore` |
+| `--recording_prefix` | S3 prefix for session recordings. | No | recordings | `agentcore` |

 ### Environment Variables

@@ -137,3 +177,4 @@ The `main.py` script is the command-line interface (CLI) for running the browser
 | GEMINI_API_KEY | Your API key for the Gemini model. | Yes |
 | BROWSERBASE_API_KEY | Your API key for Browserbase. | Yes (when using the browserbase environment) |
 | BROWSERBASE_PROJECT_ID | Your Project ID for Browserbase. | Yes (when using the browserbase environment) |
+| AWS_REGION | AWS region for AgentCore Browser. | No (auto-detected from AWS config when using agentcore environment) |
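
The README's description of the auto-created role (trust policy for `bedrock-agentcore.amazonaws.com`, three S3 actions scoped to the bucket and prefix) can be sketched as two IAM policy documents. The role-creation helper in `utils.py` is not shown in this excerpt, so the following is an assumption about the shape of those documents rather than the committed implementation; `build_trust_policy` and `build_recording_policy` are hypothetical names, and `sts:AssumeRole` is assumed as the trust action.

```python
def build_trust_policy() -> dict:
    """Trust policy letting the AgentCore service assume the role (assumed shape)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock-agentcore.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_recording_policy(bucket: str, prefix: str) -> dict:
    """Permissions policy limited to objects under s3://bucket/prefix/."""
    key_prefix = prefix.strip("/")
    # An empty prefix means the whole bucket; otherwise only keys under it.
    object_path = f"{key_prefix}/*" if key_prefix else "*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject",
                    "s3:ListMultipartUploadParts",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": f"arn:aws:s3:::{bucket}/{object_path}",
            }
        ],
    }
```

Serialized with `json.dumps`, dictionaries of this shape are what the boto3 IAM client expects for `AssumeRolePolicyDocument` in `create_role` and `PolicyDocument` in `put_role_policy`.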

computers/__init__.py

Lines changed: 2 additions & 0 deletions
@@ -14,10 +14,12 @@
 from .computer import Computer, EnvState
 from .browserbase.browserbase import BrowserbaseComputer
 from .playwright.playwright import PlaywrightComputer
+from .agentcore.agentcore import AgentCoreComputer

 __all__ = [
     "Computer",
     "EnvState",
     "BrowserbaseComputer",
     "PlaywrightComputer",
+    "AgentCoreComputer",
 ]

computers/agentcore/__init__.py

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+from .agentcore import AgentCoreComputer
+
+__all__ = ["AgentCoreComputer"]

computers/agentcore/agentcore.py

Lines changed: 149 additions & 0 deletions
@@ -0,0 +1,149 @@
+import os
+
+import termcolor
+from playwright.sync_api import sync_playwright
+
+from ..playwright.playwright import PlaywrightComputer
+from . import utils
+
+
+class AgentCoreComputer(PlaywrightComputer):
+    """Connects to Amazon Bedrock AgentCore Browser via CDP.
+
+    Supports optional session recording to S3 for replay and debugging.
+    """
+
+    def __init__(
+        self,
+        screen_size: tuple[int, int],
+        initial_url: str = "https://www.google.com",
+        recording_bucket: str | None = None,
+        recording_prefix: str = "recordings",
+        execution_role_arn: str | None = None,
+        create_execution_role: bool = False,
+        browser_identifier: str | None = None,
+        region: str | None = None,
+    ):
+        from boto3.session import Session
+
+        super().__init__(screen_size, initial_url)
+        self._recording_bucket: str | None = recording_bucket
+        self._recording_prefix: str = recording_prefix
+        self._execution_role_arn: str | None = execution_role_arn
+        self._create_execution_role: bool = create_execution_role
+        self._browser_identifier: str = (
+            browser_identifier or
+            os.getenv("AGENTCORE_BROWSER_IDENTIFIER", "aws.browser.v1")
+        )
+        # Determine region with fallback chain
+        boto_region = Session().region_name
+        self._region: str = (
+            region
+            or os.getenv("AGENTCORE_REGION")
+            or os.getenv("AWS_REGION")
+            or (boto_region if isinstance(boto_region, str) else None)
+            or "us-west-2"
+        )
+        self._created_browser: bool = False
+        self._client = None
+
+    def __enter__(self):
+        from bedrock_agentcore.tools.browser_client import BrowserClient
+
+        print("Creating AgentCore browser session...")
+
+        region = self._region
+
+        # Create browser with recording if bucket specified
+        browser_identifier_to_use = self._browser_identifier
+        if self._recording_bucket:
+            # If browser_identifier is already a browser ID (starts with "br-"), use it directly
+            if self._browser_identifier.startswith("br-"):
+                termcolor.cprint(
+                    f"Using provided browser ID: {self._browser_identifier}",
+                    color="cyan"
+                )
+                browser_identifier_to_use = self._browser_identifier
+            else:
+                # Create a unique browser name based on the bucket and prefix
+                # This ensures each recording configuration gets its own browser
+                import hashlib
+                config_hash = hashlib.sha256(
+                    f"{self._recording_bucket}/{self._recording_prefix}".encode()
+                ).hexdigest()[:8]
+                browser_name = f"recording_{config_hash}"
+
+                self._execution_role_arn, browser_id = utils.setup_browser_recording(
+                    browser_name,
+                    self._browser_identifier,
+                    self._recording_bucket,
+                    self._recording_prefix,
+                    self._execution_role_arn,
+                    self._create_execution_role,
+                    region
+                )
+                # Use the custom browser ID instead of the original identifier
+                browser_identifier_to_use = browser_id
+
+        self._client = BrowserClient(region)
+
+        session_id = self._client.start(
+            identifier=browser_identifier_to_use,
+            name="gemini-browser-session"
+        )
+        print(f"AgentCore browser session started: {session_id}")
+
+        ws_url, headers = self._client.generate_ws_headers()
+
+        self._playwright = sync_playwright().start()
+        self._browser = self._playwright.chromium.connect_over_cdp(
+            ws_url,
+            headers=headers
+        )
+        self._context = self._browser.contexts[0]
+        self._page = self._context.pages[0]
+
+        # Set viewport explicitly (CDP connection doesn't inherit from session config)
+        self._page.set_viewport_size({
+            "width": self._screen_size[0],
+            "height": self._screen_size[1]
+        })
+
+        self._page.goto(self._initial_url)
+
+        self._context.on("page", self._handle_new_page)
+
+        termcolor.cprint(
+            f"AgentCore browser session started in {region}",
+            color="green",
+            attrs=["bold"],
+        )
+
+        return self
+
+    def __exit__(self, exc_type, exc_val, exc_tb):
+        # Clean up in reverse order, with error handling for each step
+        try:
+            if self._page:
+                self._page.close()
+
+            if self._context:
+                self._context.close()
+
+            if self._browser:
+                self._browser.close()
+        finally:
+            try:
+                if self._client:
+                    _ = self._client.stop()
+            finally:
+                try:
+                    if self._playwright:
+                        self._playwright.stop()
+                finally:
+                    termcolor.cprint(
+                        "AgentCore browser session stopped",
+                        color="green",
+                        attrs=["bold"],
+                    )
+
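
The nested `try`/`finally` blocks in `__exit__` ensure that the AgentCore client and Playwright are stopped even if closing the page, context, or browser raises. One possible flatter alternative, not what this commit does, is `contextlib.ExitStack`, which runs every registered callback even when an earlier one fails. The sketch below uses a hypothetical `close_all` helper and plain callables in place of the real Playwright and AgentCore objects.

```python
from contextlib import ExitStack
from typing import Callable, Sequence


def close_all(steps: Sequence[Callable[[], None]]) -> None:
    """Run every cleanup step in the given order, even if some raise.

    ExitStack invokes callbacks last-in, first-out, so the steps are
    registered in reverse to execute them in the order they were passed.
    If any step raises, the remaining steps still run and the most
    recent exception propagates once the stack has unwound.
    """
    with ExitStack() as stack:
        for step in reversed(steps):
            stack.callback(step)
```

Note one behavioural difference from the committed code: there, a failure in `self._page.close()` skips the context and browser closes because they share one `try` block, whereas this variant attempts every step independently.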
