fix(tools): return browser timeout as observation #2455

Draft
neubig wants to merge 3 commits into main from fix/browser-timeout-observation

neubig commented Mar 15, 2026

Summary

  • catch browser action timeouts in BrowserToolExecutor.__call__ and return a normal BrowserObservation error instead of bubbling up a fatal conversation TimeoutError
  • format browser exceptions that have an empty message so users do not see a blank "Browser operation failed:" message
  • add a regression test covering timeout-to-observation behavior
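
For readers who want to see the shape of the first bullet, here is a minimal sketch of the timeout-to-observation idea. It is not the code in impl.py: `SketchObservation`, `hung_navigate`, and `call_with_timeout` are hypothetical names invented for illustration, and the real `BrowserToolExecutor.__call__` may enforce its timeout differently. The sketch only assumes that the browser action is awaitable.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SketchObservation:
    """Hypothetical stand-in for BrowserObservation; not the SDK class."""

    text: str
    is_error: bool = False


async def hung_navigate() -> str:
    """Simulates a browser action that does not return in time."""
    await asyncio.sleep(5)
    return "page loaded"


def call_with_timeout(action, timeout_s: float) -> SketchObservation:
    """Run an async action and turn a timeout into an error observation."""
    try:
        result = asyncio.run(asyncio.wait_for(action(), timeout=timeout_s))
        return SketchObservation(text=result)
    except asyncio.TimeoutError:
        # The timeout stays inside the tool boundary: the caller gets a
        # normal (error) observation instead of an exception.
        return SketchObservation(
            text=(
                "Browser operation failed: "
                f"Operation timed out after {timeout_s:g} seconds"
            ),
            is_error=True,
        )
```

Calling `call_with_timeout(hung_navigate, 2)` returns an observation with `is_error=True` rather than raising, so whatever drives the conversation can hand the failure to the agent and continue.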

Checklist

  • If the PR is changing/adding functionality, are there tests to reflect this?
  • If there is an example, have you run the example to make sure that it works? (not applicable)
  • If there are instructions on how to run the code, have you followed the instructions and made sure that it works? (not applicable)
  • If the feature is significant enough to require documentation, is there a PR open on the OpenHands/docs repository with the same branch name? (not applicable)
  • Is the github CI passing? (not yet)

Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant | Architectures | Base Image | Docs / Tags
java | amd64, arm64 | eclipse-temurin:17-jdk | Link
python | amd64, arm64 | nikolaik/python-nodejs:python3.13-nodejs22 | Link
golang | amd64, arm64 | golang:1.21-bookworm | Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:3054c30-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-3054c30-python \
  ghcr.io/openhands/agent-server:3054c30-python

All tags pushed for this build

ghcr.io/openhands/agent-server:3054c30-golang-amd64
ghcr.io/openhands/agent-server:3054c30-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:3054c30-golang-arm64
ghcr.io/openhands/agent-server:3054c30-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:3054c30-java-amd64
ghcr.io/openhands/agent-server:3054c30-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:3054c30-java-arm64
ghcr.io/openhands/agent-server:3054c30-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:3054c30-python-amd64
ghcr.io/openhands/agent-server:3054c30-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-amd64
ghcr.io/openhands/agent-server:3054c30-python-arm64
ghcr.io/openhands/agent-server:3054c30-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-arm64
ghcr.io/openhands/agent-server:3054c30-golang
ghcr.io/openhands/agent-server:3054c30-java
ghcr.io/openhands/agent-server:3054c30-python

About Multi-Architecture Support

  • Each variant tag (e.g., 3054c30-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 3054c30-python-amd64) are also available if needed

Catch browser action timeouts at the browser tool boundary so a hung browser call returns a BrowserObservation error instead of bubbling up as a fatal conversation TimeoutError.

Also format empty browser exceptions more clearly to avoid blank 'Browser operation failed:' messages.

Co-authored-by: openhands <openhands@all-hands.dev>
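
The empty-exception problem mentioned in this commit message is easy to reproduce: `str(TimeoutError())` is an empty string, so naive formatting ends in a bare colon. Below is a minimal sketch of one way to avoid that. `format_browser_error` is a hypothetical helper, and falling back to the exception class name is an assumption; the commit may choose different fallback text.

```python
def format_browser_error(exc: BaseException) -> str:
    """Build a user-facing message that is never blank after the colon."""
    # Fall back to the exception class name when the message is empty.
    detail = str(exc).strip() or type(exc).__name__
    return f"Browser operation failed: {detail}"
```

With this helper, an exception raised without arguments still produces a readable message, while exceptions that carry their own text are passed through unchanged.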

github-actions bot commented Mar 15, 2026

Python API breakage checks — ✅ PASSED

Result: PASSED

Action log

github-actions bot commented Mar 15, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result: PASSED

Action log

neubig pushed a commit that referenced this pull request Mar 15, 2026
Bump openhands-agent-server from 1.14.0 to 1.15.0 so the REST API breakage workflow reflects the current breaking API surface already present on main.

This is intentionally separate from PR #2455, whose failing REST API check is unrelated to the browser timeout fix.

Co-authored-by: openhands <openhands@all-hands.dev>

neubig commented Mar 16, 2026

@OpenHands test this live by creating a web service that does not respond in the allotted time and creating an example, following the examples folder, that asks the agent to query this web service. Run the example before and after this fix and demonstrate that it's better. Fix any failing tests as well.

openhands-ai bot commented Mar 16, 2026

I'm on it! neubig can track my progress at all-hands.dev

neubig commented Mar 16, 2026

Addressed in 4010efb.

What changed:

  • kept browser action timeouts on the tool path as normal BrowserObservation errors
  • added a live slow-service regression test in tests/tools/browser_use/test_browser_executor.py
  • added examples/01_standalone_sdk/45_browser_timeout_observation.py, a self-contained example that starts a slow web service and has an agent call browser_navigate against it
  • excluded the new example from the generic example test sweep because it intentionally runs its own live timeout scenario
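
As a rough illustration of the slow-service setup described above, the sketch below starts a local HTTP server that sleeps before answering and queries it with a short client timeout. It uses only the standard library and a plain `urllib` client in place of the browser tool; `SlowHandler`, `start_slow_service`, and `fetch` are hypothetical names, and the actual example file may be structured quite differently.

```python
import http.server
import threading
import time
import urllib.request


class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Answers GET requests only after sleeping for delay_s seconds."""

    delay_s = 1.0

    def do_GET(self):
        time.sleep(self.delay_s)
        try:
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"finally awake\n")
        except (BrokenPipeError, ConnectionResetError):
            pass  # the client already gave up

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def start_slow_service(delay_s: float) -> http.server.ThreadingHTTPServer:
    """Start the slow server on a free local port in a daemon thread."""
    handler = type("Handler", (SlowHandler,), {"delay_s": delay_s})
    server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url: str, timeout_s: float) -> str:
    """Return the response body, or an error string instead of raising."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as resp:
            return resp.read().decode()
    except OSError as exc:  # covers socket timeouts and URLError
        return f"request failed: {exc!r}"
```

Pointing `fetch` at `http://127.0.0.1:<port>/` with a timeout shorter than the server delay yields an error string rather than an exception, which mirrors, in miniature, the behavior this PR gives the browser tool.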

Validation:

  • uv run pre-commit run --files openhands-tools/openhands/tools/browser_use/impl.py tests/tools/browser_use/test_browser_executor.py tests/examples/test_examples.py examples/01_standalone_sdk/45_browser_timeout_observation.py
  • uv run pytest tests/tools/browser_use/test_browser_executor.py -q
  • uv run pytest tests/examples/test_examples.py -q
  • uv run python examples/01_standalone_sdk/45_browser_timeout_observation.py

Before/after demo using the new example:

  • legacy behavior (monkeypatched old BrowserToolExecutor.__call__): the example aborts with a fatal ConversationRunError caused by TimeoutError
  • current behavior: the agent receives "Browser operation failed: Operation timed out after 2 seconds", then finishes cleanly with the message "The slow web service timed out, but the browser tool returned a normal error observation instead of crashing the conversation."

I also checked local targeted tests after the rebase; the new regression coverage is green.

openhands-ai bot commented Mar 16, 2026

Done — the new changes since the last summary are complete and pushed.

Checklist:

  • Added a live slow-service regression test in tests/tools/browser_use/test_browser_executor.py that verifies a real timeout is returned as a normal BrowserObservation error.
  • Added a standalone, self-contained example at examples/01_standalone_sdk/45_browser_timeout_observation.py.
  • Demonstrated before/after behavior with that example:
    • before/legacy behavior: fatal ConversationRunError caused by TimeoutError
    • after/current behavior: the tool returns "Browser operation failed: Operation timed out after 2 seconds", and the conversation finishes normally
  • Updated tests/examples/test_examples.py to exclude the new example from the generic example sweep since it intentionally runs a live timeout scenario.
  • Ran targeted validation and pre-commit successfully.
  • Committed and pushed to the PR branch.

Conciseness check:

  • Changes are focused on the reviewer's request.
  • No extra files beyond the requested example, test coverage, and necessary test-list update.

Validation run:

  • uv run pre-commit run --files openhands-tools/openhands/tools/browser_use/impl.py tests/tools/browser_use/test_browser_executor.py tests/examples/test_examples.py examples/01_standalone_sdk/45_browser_timeout_observation.py
  • uv run pytest tests/tools/browser_use/test_browser_executor.py -q
  • uv run pytest tests/examples/test_examples.py -q
  • uv run python examples/01_standalone_sdk/45_browser_timeout_observation.py

Pushed commit:

  • 4010efb7a04159930d0587a8505d6602a1a1fa8f

PR update comment:

github-actions bot commented Mar 16, 2026

Coverage

Coverage Report
File | Stmts | Miss | Cover | Missing
openhands-tools/openhands/tools/browser_use
   impl.py | 257 | 159 | 38% | 57–61, 63–64, 66, 68, 70–73, 75–76, 78, 80, 87, 99–103, 105, 109–110, 115, 117–119, 121–122, 130–132, 134–138, 143, 194, 199–202, 204, 226–228, 231–233, 235, 248, 285–286, 290, 300, 315–316, 321, 335–336, 341–342, 352, 370–371, 373–386, 389–402, 404–405, 411, 416–419, 427, 429, 432–433, 439–440, 445–446, 452–453, 457–458, 462–463, 467, 469–470, 472–475, 478–479, 485, 487, 489, 497–498, 502–503, 508–509, 513–514, 518–519, 524–525, 537–538, 549–550, 554–558, 572–574, 579, 584–585, 594–595
TOTAL | 19948 | 9974 | 50% |
