feat(tools): add use_browser tool to Strands tools repository #88

jimbrub · 2025-06-16T13:38:28Z

Description

Adding a use_browser tool, that gives an agent actions, so they interact with browsers (just chromium right now). This tool relies on Playwright to perform automated actions on a browser. Playwright allows for a browser to be launched with different launch options. One of these launch options is persistent_context, which allows users to launch a browser and load in the saved data from whatever the specified user_data_dir is. This feature is used to save cookie settings, local storage, sessions storage, browser history, saved passwords, etc. Right now you can ask an agent to use the use_browser tool to launch a browser with persistent_context set to true and specify a file path for user_data_dir. Then, any usernames, bookmarks you create, or search history should be stored in the user_data_dir and then the browser can be relaunched at a later date.

Related Issues

(https://github.com/strands-agents/private-sdk-python-staging/issues/72)

Documentation PR

Will need to add documentation about use_browser tool.

Type of Change

New Tool
Other (please describe):

Testing

[How have you tested the change?]

hatch fmt --linter
hatch fmt --formatter
hatch test --all

Checklist

I have read the CONTRIBUTING document
I have added tests that prove my fix is effective or my feature works
I have updated the documentation accordingly
I have added an appropriate example to the documentation to outline the feature
My changes generate no new warnings
Any dependent changes have been merged and published
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

src/strands_tools/use_browser.py

Adding functionality so use_browser tool has some retry functionality and supports multiple tabs and switching between them.

src/strands_tools/use_browser.py

mehtarac · 2025-06-20T14:31:12Z

src/strands_tools/use_browser.py


-            elif action == "click":
-                result += await self._handle_click_action(page, args)
+    async def retry_action(self, action_func, max_retries=3, delay=1.0, action_name=None, args=None):


Let's make this configurable by environment variables. Along with the delay parameter.

src/strands_tools/use_browser.py

tests/test_use_browser.py

mehtarac

High level comment:

Let's add this tool to the README as well

jimbrub · 2025-06-20T17:06:12Z

High level comment:

Let's add this tool to the README as well

tests/test_use_browser.py

zastrowm · 2025-06-20T17:41:47Z

tests/test_use_browser.py

+def setup_test_environment():
+    """Fixture to set up common test environment"""
+    original_value = os.environ.get("BYPASS_TOOL_CONSENT", None)
+    os.environ["BYPASS_TOOL_CONSENT"] = "true"


Using mocking of the environment. See

tools/tests/test_environment.py

Line 29 in 44f5d23

def os_environment():

for example

I see, thank you for this. Autouse works well for the setup for the tests

…errors for better error handling

…actions are only called with their required arguments

zastrowm · 2025-06-23T18:23:45Z

pyproject.toml

@@ -38,6 +38,8 @@ dependencies = [
    "tenacity>=9.1.2,<10.0.0",
    "watchdog>=6.0.0,<7.0.0",
    "slack_bolt>=1.23.0,<2.0.0",
+    "nest-asyncio>=1.5.0,<2.0.0",
+    "playwright>=1.42.0,<2.0.0",


How heavy is this dependency?

I'm leaning towards this being an optional dependency as I don't want everyone to need to download playwright just to use memory for instance

I wasn't too sure on the acceptable way to add these dependencies. As long as adding them as optional dependencies keeps the same functionality with the tool, I would rather do it that way.

zastrowm · 2025-06-23T18:24:57Z

src/strands_tools/use_browser.py

+from strands_tools.utils.user_input import get_user_input
+
+# Only configure this module's logger, not the root logger
+enable_debug = os.getenv("ENABLE_DEBUG_BROWSER_LOGS", "false").lower() == "true"


What's our naming convention? I think use a STRANDS prefix elsewhere and I wonder if we should add that prefix

Would a better convention for this variable be "STRANDS_USE_BROWSER_ENABLE_DEBUG_LOGS"?

+1 to Mackenzie's point. It's a small thing but it'll be good to follow the naming convention that we have throughout the repo.

zastrowm · 2025-06-23T18:26:36Z

src/strands_tools/use_browser.py

+enable_debug = os.getenv("ENABLE_DEBUG_BROWSER_LOGS", "false").lower() == "true"
+
+logger = logging.getLogger(__name__)
+logger.setLevel(logging.DEBUG if enable_debug else logging.INFO)


This seems like a client-level concern, not a tool level concern. Any reason we shouldn't let the application/caller set up logging properly?

I was having an issue with the logs in the past printing out everything, I will re-look at this, logs should be configured on client-level, that makes sense.

zastrowm · 2025-06-23T20:46:32Z

src/strands_tools/use_browser.py

+_playwright_manager = None
+
+# Apply nested event loop support
+nest_asyncio.apply()


Can this be done lazily at all? I'm a little concerned that simply importing the tool changes the behavior

I think I can move this into the initialization code for the BrowserManager class

zastrowm · 2025-06-23T20:47:12Z

src/strands_tools/use_browser.py

+console = Console()
+
+# Global browser manager instance
+_playwright_manager = None


Is there a specific reason to have a global instance? If so, can that reason be added as a comment?

Are there any limitations around multiple agents interacting with this tool?

There is no reason that I have found to why it needs a global instance, let me test the tool with that line in the tool function

zastrowm · 2025-06-23T20:50:39Z

src/strands_tools/use_browser.py

+        self._loop = asyncio.new_event_loop()
+        asyncio.set_event_loop(self._loop)
+        self.action_configs = {
+            "navigate": {


Rather than config as code, would it make sense to go to a defined object?

class ApiMethods: def navigate(page: Page, url: str): page.wait_for_load_state("networkidle")) return "Navigated to {url}" ....

I think you'd be able to use reflection at runtime to build up the same data here, but it'd be a more declarative way of exposing actions

I like this approach as well. This way we get access to all the actions of the class in one go.

zastrowm · 2025-06-23T20:51:05Z

src/strands_tools/use_browser.py

+                "method": lambda page, selector: page.text_content(selector),
+                "required_args": ["page", "selector"],
+                "required_params": [("selector", str)],
+                "post_process": lambda result: result,


Is this actually needed?

Ahh, I see let me test it without that and see

zastrowm · 2025-06-23T20:52:06Z

src/strands_tools/use_browser.py

+        logger.debug("Ensuring browser is running...")
+
+        # Ensure required directories exist
+        user_data_dir = os.getenv("BROWSER_USER_DATA_DIR", os.path.join(os.path.expanduser("~"), ".browser_automation"))


Related to my earlier question - what's our convention with naming? Should these be prefixed with STRANDS_?

Are these parameters only required because we don't provide a way to configure tools otherwise? What if you implemented it such as:

class BrowserTool: def __init__(... params...): @tool def use_browser(...) ...

Usage would then be:

use_browser = BrowserTool(... params ...).use_browser

which would also allow you to isolate state.

(this may or may not work today; but after strands-agents/sdk-python#258 it should work). We may also want to ping the team to get thoughts on this class based approach for this case

I do like the idea of a class based approach here

…owser actions as methods

feat(tools): add use_browser tool to Strands tools repository

1ae6085

jimbrub requested a review from a team as a code owner June 16, 2025 13:38

test(use_browser): add more unit testing for use_browser tool

b3afd62

mehtarac reviewed Jun 17, 2025

View reviewed changes