fix: synchronize ACP telemetry and refresh remote final state#2460
simonrosenberg wants to merge 2 commits into main
Conversation
- Python API breakage checks — ✅ PASSED
- REST API breakage checks (OpenAPI) — ❌ FAILED
all-hands-bot left a comment
Taste Rating: 🟡 Acceptable - Core telemetry fix is solid, but bundled breaking changes need attention
Verdict: ❌ Needs documentation - The synchronization fix is correct, but two undocumented breaking changes (retry removal + hook regression) should be called out in the PR description or split into separate PRs.
Key Insight: Moving cost recording to a single synchronized path after UsageUpdate receipt is the right fix for zero-cost telemetry. The per-session tracking is cleaner than global state. However, removing ~150 lines of retry logic and changing hook behavior are significant changes that deserve explicit justification.
```python
# PromptResponse, so under normal conditions the notification handler
# completes almost immediately. This timeout is a safety net for slow
# or remote servers.
_USAGE_UPDATE_TIMEOUT: float = float(
```
🔴 Critical - Undocumented Breaking Change: The PR completely removes connection retry logic (previously handled ConnectionError, BrokenPipeError, EOFError with 3 retries + exponential backoff). This timeout constant replacement is good, but the retry removal is undocumented.
Why this matters: Production ACP servers can have temporary network blips, container restarts, or connection resets. The old code tolerated these via retry. The new code will fail immediately.
What is needed: Either (1) explain in the PR description why retry is no longer needed, or (2) restore retry logic, or (3) split this removal into a separate PR with justification.
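For reference, the removed behavior the reviewer describes (3 retries on `ConnectionError`/`BrokenPipeError`/`EOFError` with exponential backoff) can be sketched as follows; the helper name and delay values are illustrative, not the PR's original code:

```python
import time


def call_with_retry(fn, retries=3, base_delay=0.5):
    """Retry transient connection failures with exponential backoff.

    Illustrative sketch of the removed retry logic: up to `retries`
    re-attempts on connection-level errors, doubling the delay each time.
    """
    for attempt in range(retries + 1):
        try:
            return fn()
        except (ConnectionError, BrokenPipeError, EOFError):
            if attempt == retries:
                raise  # retries exhausted; surface the error
            time.sleep(base_delay * (2 ** attempt))
```

With such a wrapper, a temporary network blip or container restart is absorbed rather than failing the prompt immediately.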
```python
        elapsed: Wall-clock seconds for this prompt round-trip (optional).
        usage_update: The synchronized ACP UsageUpdate for this turn, if any.
    """
    if usage_update is not None and usage_update.cost is not None:
```
🟢 Acceptable: Single telemetry recording path is exactly what was needed. Cost/tokens/latency all processed in one place after synchronization eliminates the split-brain problem described in the issue. Per-session cost tracking (_last_cost_by_session) correctly handles multiple concurrent sessions.
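The per-session bookkeeping the reviewer praises can be sketched like this (class and method names are illustrative, not the PR's actual code): each session stores its own last cumulative cost, so the per-turn delta is computed independently for concurrent sessions.

```python
class CostTracker:
    """Illustrative sketch of per-session cost tracking (_last_cost_by_session).

    Each session keeps its own last-seen cumulative cost, so per-turn deltas
    for concurrent sessions never interfere with each other.
    """

    def __init__(self) -> None:
        self._last_cost_by_session: dict[str, float] = {}

    def record_turn(self, session_id: str, cumulative_cost: float) -> float:
        # Delta against this session's previous cumulative cost only.
        last = self._last_cost_by_session.get(session_id, 0.0)
        self._last_cost_by_session[session_id] = cumulative_cost
        return cumulative_cost - last
```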
```python
    self._session_id,
)
await _drain_notifications()
if self._client.get_turn_usage_update(self._session_id or "") is None:
```
🟢 Acceptable - Good Taste: This synchronization approach is correct. Preparing the event before prompt(), waiting for UsageUpdate notification if not already received, then processing in a single path eliminates the race condition. The 2.0s timeout is reasonable (server writes UsageUpdate before PromptResponse per ACP protocol). Configurable via env var is pragmatic.
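The synchronization step described above can be sketched with `asyncio.wait_for` on an event that the notification handler sets (the function and parameter names are illustrative, not the PR's actual API):

```python
import asyncio


async def wait_for_usage_update(event: asyncio.Event, timeout: float = 2.0) -> bool:
    """Wait up to `timeout` seconds for the UsageUpdate handler to set `event`.

    Sketch of the pattern: the event is prepared before prompt() is sent; after
    the PromptResponse arrives, the caller waits briefly for the notification.
    Returns False on timeout so the caller can log a warning and record
    whatever partial metrics are available.
    """
    try:
        await asyncio.wait_for(event.wait(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False
```

Since the server writes UsageUpdate before PromptResponse under normal conditions, the wait usually returns immediately; the timeout only matters for slow or buggy servers.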
```python
    HookEventType.USER_PROMPT_SUBMIT,
    HookEventType.STOP,
)
if any(hook_config.has_hooks_for_event(t) for t in unsupported):
```
🔴 Critical - Breaking Change: This changes hook behavior from "all hooks executed server-side" to "only SessionStart/SessionEnd supported locally, others emit warning but do not execute".
What breaks: Code relying on PRE_TOOL_USE, POST_TOOL_USE, USER_PROMPT_SUBMIT, or STOP hooks will silently stop working (just a warning in logs).
Why risky: This is a functional regression unrelated to the telemetry fix. It should be: (1) Documented in the PR description as a breaking change, OR (2) Split into a separate PR with justification for the regression.
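The warn-but-don't-execute path the reviewer flags can be sketched as follows; the enum values match the event names in the diff, but the helper and enum definitions here are illustrative, not the SDK's actual code:

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class HookEventType(Enum):
    SESSION_START = "session_start"
    SESSION_END = "session_end"
    PRE_TOOL_USE = "pre_tool_use"
    POST_TOOL_USE = "post_tool_use"
    USER_PROMPT_SUBMIT = "user_prompt_submit"
    STOP = "stop"


# Events the new code warns about instead of executing (per the review).
UNSUPPORTED = (
    HookEventType.PRE_TOOL_USE,
    HookEventType.POST_TOOL_USE,
    HookEventType.USER_PROMPT_SUBMIT,
    HookEventType.STOP,
)


def warn_on_unsupported_hooks(configured: set) -> list:
    """Report configured hook events that will NOT run; emit a single warning."""
    ignored = [t for t in UNSUPPORTED if t in configured]
    if ignored:
        logger.warning("Unsupported ACP hook events will not run: %s", ignored)
    return ignored
```

Note the failure mode the reviewer describes: the caller gets only a log line, so code that registered a `PRE_TOOL_USE` hook keeps running with the hook silently skipped.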
all-hands-bot left a comment
Additional feedback on test coverage:
```python
mock_response = MagicMock()
mock_response.usage = None
assert len(agent.llm.metrics.token_usages) == 0
```
🟡 Suggestion - Test Gap: The PR tests when UsageUpdate is completely absent (cost=None in fixture), but not when it times out after 2.0s. Consider adding a test that verifies the timeout path:

```python
def test_step_records_partial_metrics_on_usage_timeout(self, tmp_path):
    """Timeout waiting for UsageUpdate logs warning but records available token metrics."""
    # Setup: executor returns response but never populates turn_usage_updates
    # Assert: warning logged, token_usages recorded from PromptResponse, but costs remain at 0
```

This would verify graceful degradation when the ACP server is slow or buggy and UsageUpdate never arrives before the timeout.
Addressed the remaining review items:
Verification:
Fixes #2375
This implements the fix direction from the issue discussion:
- `ACPAgent` now waits for the `UsageUpdate` notification delivered via `session_update()` before recording cost/tokens/latency.
- `RemoteConversation` now refreshes the final remote state before `run()` returns.

Why:
Latest zero-cost ACP benchmark rows were caused by two separate correctness problems:
- Telemetry could be recorded before the `UsageUpdate` notification arrived, so cost was recorded as zero.
- `RemoteConversation` could return from REST fallback with stale cached state, leaving `conversation_stats` at zero even when the server had final stats.

Tests:
```shell
PYTHONPATH=/tmp/sdk-issue-2375/openhands-sdk${PYTHONPATH:+:$PYTHONPATH} pytest tests/sdk/agent/test_acp_agent.py tests/sdk/conversation/remote/test_remote_conversation.py
```

Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
- eclipse-temurin:17-jdk
- nikolaik/python-nodejs:python3.13-nodejs22
- golang:1.21-bookworm

Pull (multi-arch manifest)
```shell
# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:d61f7eb-python
```

Run
All tags pushed for this build
About Multi-Architecture Support
- Each variant tag (e.g. `d61f7eb-python`) is a multi-arch manifest supporting both amd64 and arm64
- Architecture-specific tags (e.g. `d61f7eb-python-amd64`) are also available if needed