Releases: OpenHands/software-agent-sdk
v1.15.0
- Stronger API stability and upgrade safeguards, with clearer deprecation expectations and better protection against breaking changes.
- Expanded RemoteRuntime capabilities and reliability, including improved ACP-based workflows and more dependable remote state handling.
- New developer-facing SDK features, such as parallel tool execution, additional public APIs, richer plugin metadata, and better workspace credential inheritance.
- Broader model compatibility and better reasoning-model support, including newly verified models and fixes for provider-specific behavior.
- Faster, more reliable builds and CI, plus a range of bug fixes and dependency updates that improve day-to-day stability.
What's Changed
- Enable ACPAgent on RemoteRuntime API by @simonrosenberg in #2190
- Make API breakage workflows fail loudly by @enyst in #2432
- Clarify REST contract deprecation policy by @enyst in #2433
- Highlight API breakage check comments more clearly by @enyst in #2434
- Enforce REST deprecation deadlines by @enyst in #2435
- Remove stale Python API workflow env by @enyst in #2442
- Run API breakage checks on push to main by @enyst in #2443
- Add rich package version 14.3.3 to dependency constraints by @yitao-li in #2414
- Export TokenUsage, page_iterator, and AsyncRemoteWorkspace as public SDK APIs by @enyst in #2445
- Fix apptainer workspace cleanup: kill zombie child processes. by @adityasoni9998 in #2450
- chore(deps): bump pyjwt from 2.11.0 to 2.12.0 by @dependabot[bot] in #2448
- Add docstring guidelines and fix key docstrings for MDX compatibility by @rbren in #2452
- ci: guard package version bumps outside release PRs by @enyst in #2457
- Fix: Add tags to root endpoint for OpenAPI spec by @rbren in #2458
- Fix Python selection in version-bump PR workflow by @neubig in #2430
- Revert PR #2190: Enable ACPAgent on RemoteRuntime API by @enyst in #2451
- test(sdk): reproduce delegate resume compatibility regression by @neubig in #2382
- Enforce REST API deprecation runway in breakage checks by @enyst in #2464
- Use version tag for agent server image in version bump prs by @aivong-openhands in #2427
- Enable ACPAgent on RemoteRuntime API via ACP conversation endpoints by @simonrosenberg in #2465
- Refine temperature/top_p handling for reasoning models by @mayeco in #2277
- chore(deps): bump authlib from 1.6.7 to 1.6.9 by @dependabot[bot] in #2475
- feat(prompt): add AI disclosure policy for external service communications by @xingyaoww in #2476
- feat(build): add OPENHANDS_BUILDKIT_CACHE_MODE env var to control cache export by @simonrosenberg in #2479
- fix: preflight check now validates reasoning_content for thinking models by @juanmichelini in #2420
- Migrate PR review plugin to extensions repository by @juanmichelini in #2324
- Remove multiswebench from CI eval workflow options by @juanmichelini in #2483
- chore(deps): bump pyasn1 from 0.6.2 to 0.6.3 by @dependabot[bot] in #2484
- Add GPT-5.4 to verified models by @juanmichelini in #2487
- fix: synchronize ACP telemetry and refresh remote final state by @simonrosenberg in #2460
- refactor(sdk/subagent): showing tools each subagent has by @VascoSch92 in #2480
- feat: workspace.get_llm() and get_secrets() for OpenHandsCloudWorkspace credential inheritance by @xingyaoww in #2409
- fix: cherry-pick cache_export_seconds telemetry fix to main by @simonrosenberg in #2493
- Add MiniMax-M2.7 to resolve_model_config.py by @juanmichelini in #2500
- chore(deps): bump pypdf from 6.8.0 to 6.9.1 by @dependabot[bot] in #2497
- fix(workflow): remove unused DATASET/SPLIT env vars from run-eval workflow by @VascoSch92 in #2504
- fix(ci): ignore Field deprecated metadata in API breakage check by @enyst in #2508
- chore: add Dependabot configuration for GitHub Actions updates by @aivong-openhands in #2501
- chore(deps): bump docker/login-action from 3 to 4 by @dependabot[bot] in #2520
- chore(deps): bump actions/download-artifact from 6 to 8 by @dependabot[bot] in #2517
- chore(deps): bump actions/setup-node from 4 to 6 by @dependabot[bot] in #2519
- chore(deps): bump actions/github-script from 7 to 8 by @dependabot[bot] in #2521
- chore(deps): bump actions/upload-artifact from 4 to 7 by @dependabot[bot] in #2518
- fix(examples): make the LLM profile store example directory-based by @enyst in #2507
- refactor(llm): use litellm params for reasoning support by @enyst in #1990
- Expose terminalbench in run-eval workflow by @neubig in #2360
- fix(tools): return browser timeout as observation by @neubig in #2455
- Add Google Gemini 3.1 verify models by @mayeco in #2276
- feat(workflow): Expected instance_ids format (no spaces) by @VascoSch92 in #2502
- feat(sdk/agent): Parallel Tool Call Execution by @VascoSch92 in #2390
- fix(ci): ignore added Field metadata in SDK API breakage check by @enyst in #2524
- fix(workflow): Normalize instance_ids by stripping spaces instead of failing by @simonrosenberg in #2529
- feat(websocket): add after_timestamp filter for bi-directional event loading by @jpshackelford in #1880
- build: move SDK SHA args after expensive layers for cache reuse by @simonrosenberg in #2522
- Fix Qwen3.5-Flash low submission rate: improve JSON arg parsing and add corrective feedback by @juanmichelini in #2512
- feat(docker): make ACP npm package installation optional via build arg by @simonrosenberg in #2535
- feat(docker): make boto3 installation optional via build arg by @simonrosenberg in #2536
- feat(docker): add extra_build_args to BuildOptions by @simonrosenberg in #2541
- feat(plugin): Add entry_command field to PluginManifest by @jpshackelford in #2230
- Add `url` field to PluginAuthor to match Claude Code schema by @jpshackelford in #2546
- feat(sdk): Add browser tool usage guidelines to system prompt by @VascoSch92 in #2547
- fix: use asyncio.Event() for thread-safe initialization state by @ixchio in #2383
- fix(sdk): stop sending reasoning_effort to Kimi thinking by @enyst in #2549
- fix: add diagnostics for preflight proxy failures by @simonrosenberg in #2557
- Fix run-eval to use locked LiteLLM dependency by @simonrosenberg in #2559
- fix(sdk): pin LiteLLM version exactly by @rbren in #2558
- fix(ci): rename PAT secret to PAT_TOKEN by @simonrosenberg in #2561
- fix(ci): use /v1/models for proxy health check instead of /health by @simonrosenberg in #2563
- feat: support pre-built base images for faster rebuilds by...
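Two entries in this release (#2502 and #2529) concern the `instance_ids` workflow input: the first expects a format without spaces, the second normalizes the value by stripping spaces instead of failing. The snippet below is a minimal sketch of that kind of normalization, assuming the input is a comma-separated string. The function name `normalize_instance_ids` and the sample IDs are hypothetical and are not taken from the repository's workflow.

```python
def normalize_instance_ids(raw: str) -> list[str]:
    """Split a comma-separated ID string, removing whitespace and empty entries.

    Hypothetical helper for illustration; the real workflow may differ.
    """
    ids = []
    for part in raw.split(","):
        # "".join(part.split()) removes every whitespace character,
        # including any that sit inside an entry, not only at its ends.
        cleaned = "".join(part.split())
        if cleaned:
            ids.append(cleaned)
    return ids


if __name__ == "__main__":
    # Made-up IDs with stray spaces and a trailing comma.
    print(normalize_instance_ids("repo__proj-1, repo__proj-2 ,repo__proj-3,"))
```

Normalizing rather than rejecting keeps a manually typed input usable, at the cost of silently accepting sloppy values.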
v1.14.0
What's Changed
- Release v1.13.1 by @all-hands-bot in #2402
- Enforce OpenAPI endpoint deprecations in REST API checks by @enyst in #2405
- Allow connecting agent container to the specified Docker network. by @zlowred in #2381
- Handle fork PRs in API breakage workflows by @enyst in #2410
- Update Nemotron model config by @juanmichelini in #2398
- Add startup banner to SDK with helpful links by @rbren in #2380
- feat: emit structured docker build telemetry by @simonrosenberg in #2422
- Fix broken OpenAPI docs with path-based routing by @tofarr in #2423
- Reject SDK @deprecated on FastAPI routes by @enyst in #2424
- feat: support reusing prebuilt SDK sdists by @simonrosenberg in #2426
- Release v1.14.0 by @all-hands-bot in #2428
New Contributors
Full Changelog: v1.13.1...v1.14.0
v1.13.1
What's Changed
- fix(examples): emit EXAMPLE_COST for marketplace demo by @enyst in #2379
- Release v1.13.0 by @all-hands-bot in #2378
- chore(deps): bump pypdf from 6.7.5 to 6.8.0 by @dependabot[bot] in #2387
- Add bug and feature request issue templates for SDK by @jamiechicago312 in #1864
- Add gpt-5.4 to resolve_model_config.py by @juanmichelini in #2374
- fix: improve conversation resilience for long-running and resumed sessions by @xingyaoww in #2384
- Add NVIDIA Nemotron-3 Super 120B model to resolve_model_config.py by @juanmichelini in #2391
- fix(llm): add provider-specific verified model lists for gemini, deepseek, moonshot, minimax by @juanmichelini in #2386
- fix: truncate long skill descriptions instead of raising errors (#2394) by @xingyaoww in #2395
- chore(deps): bump tornado from 6.5.2 to 6.5.5 by @dependabot[bot] in #2400
- fix(tools): share browser executor across subagents to prevent CDP port conflicts by @VascoSch92 in #2401
- Deprecate {path:path} endpoints in file_router and add query param alternatives by @chuckbutkus in #2404
- Log response content when webhook posting fails by @tofarr in #2403
New Contributors
- @jamiechicago312 made their first contribution in #1864
Full Changelog: v1.13.0...v1.13.1
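The `{path:path}` deprecation in this release (#2404) moves file paths out of the URL path and into query parameters. As a rough illustration of why that helps, the sketch below uses only Python's standard `urllib.parse` to show a path with slashes travelling as a single encoded query value and being recovered intact. The endpoint URL and the parameter name `path` are assumptions made for the example, not the agent-server's actual contract.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_file_url(base_url: str, file_path: str) -> str:
    """Attach file_path as a query parameter (hypothetical parameter name)."""
    # urlencode percent-encodes "/" as %2F, so the whole path is one opaque value
    # rather than a sequence of URL path segments.
    return f"{base_url}?{urlencode({'path': file_path})}"


# Hypothetical endpoint, used only to have something to build on.
url = build_file_url("http://localhost:8000/api/file/download", "/workspace/src/main.py")

# The receiving side decodes the query string and gets the original path back.
recovered = parse_qs(urlsplit(url).query)["path"][0]

if __name__ == "__main__":
    print(url)
    print(recovered)
```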
v1.13.0
- RemoteConversation hooks now work end-to-end with the agent server. hook_config is now forwarded correctly, so remote conversations can execute server-side hooks instead of silently dropping them.
- Agent-server event APIs gained hook observability. Event consumers may now receive HookExecutionEvent objects, and event.source may now be "hook".
- Remote/subagent support expanded. StartConversationRequest now accepts agent_definitions, allowing server-side conversations to see client-registered subagents used by DelegateTool / TaskSetTool.
- SDK usability improved across the release line. Notable additions since v1.12.0 include rerun_actions, configurable marketplace paths, enable/disable support for installed skills/plugins, and new plugin/skill lifecycle examples.
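Because `event.source` may now be `"hook"`, event consumers that assumed a fixed set of sources may need a small adjustment. The following is a minimal sketch of separating hook events from the rest. `FakeEvent` is a stand-in dataclass defined here for illustration, not the SDK's event class, and the `kind` field and the other source values are assumptions.

```python
from dataclasses import dataclass


@dataclass
class FakeEvent:
    """Stand-in for an agent-server event; NOT the SDK's event type."""

    kind: str
    source: str


def split_hook_events(events):
    """Return (hook_events, other_events) based on the source field."""
    hook_events, other_events = [], []
    for event in events:
        if event.source == "hook":
            hook_events.append(event)
        else:
            other_events.append(event)
    return hook_events, other_events


if __name__ == "__main__":
    stream = [
        FakeEvent(kind="MessageEvent", source="user"),
        FakeEvent(kind="HookExecutionEvent", source="hook"),
        FakeEvent(kind="ActionEvent", source="agent"),
    ]
    hooks, others = split_hook_events(stream)
    print(len(hooks), len(others))
```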
What's Changed
- Release v1.12.0 by @all-hands-bot in #2302
- Remove unnecessary ACP cost estimation fallback by @simonrosenberg in #2330
- fix(sdk/subagent): remote workspace and subagents by @VascoSch92 in #2323
- Add job to print all parameters at start of run-eval workflow by @juanmichelini in #2332
- feat: add skills field to Marketplace schema with GitHub URL support by @neubig in #2325
- fix: compare API breakage checks to baseline PyPI release by @enyst in #2338
- fix: one malformed SKILL.md with invalid YAML frontmatter prevents all sibling skills from loading by @adityavkk in #2333
- feat(sdk/subagent): add profile_store_dir by @VascoSch92 in #2340
- ci: detect ACP minor version bumps by @enyst in #2346
- Add learnings from code review analysis by @neubig in #2280
- feat: add enable/disable for installed plugins by @enyst in #2336
- feat: add enable/disable for installed skills by @enyst in #2322
- refactor(sdk/agent): simplify warning by @VascoSch92 in #2355
- Upgrade to Python 3.13 and fix libtmux locale issue by @neubig in #2092
- refactor(sdk/context/skills): refactoring loading skills by @VascoSch92 in #2353
- chore: redistribute AGENTS guidance by @enyst in #2359
- feat(sdk/subagent): hooks for subagents by @VascoSch92 in #2347
- feat(sdk/subagent): add confirmation policy by @VascoSch92 in #2345
- feat(llm): switch model profile on user message by @VascoSch92 in #2192
- feat: add configurable marketplace_path setting for public skills loading by @neubig in #2253
- feat: add rerun_actions method to replay conversation actions by @csmith49 in #2351
- feat(examples): add plugin and skill lifecycle demos by @enyst in #2362
- fix(tools): issue 2365 by @VascoSch92 in #2369
- fix: send hook_config to server in RemoteConversation by @xingyaoww in #2115
- fix: add security_risk and summary to tool examples for non-native function calling by @ixchio in #2251
- refactor(sdk/llm/mixim): separate data from logic by @VascoSch92 in #2354
- feat(sdk/subagent): add mcp-servers for subagent by @VascoSch92 in #2348
New Contributors
- @adityavkk made their first contribution in #2333
Full Changelog: v1.12.0...v1.13.0
v1.12.0
- ACP / Agent Client Protocol landed end-to-end: new `ACPAgent`, visualization of ACP tool calls, and `ask_agent` support via `fork_session`.
- Agent-server got more capable and safer: new endpoints for LLM models/providers, better WebSocket auth (headers), improved event search behavior, secrets handling fixes, and a Bedrock safety fix (don’t forward `api_key`).
- Subagents + skills got a big upgrade: file-based agent definitions (Markdown + YAML frontmatter), built-in specialized agent types (Explore/Bash), support for skills/LLM profiles/max-iterations, and new install utilities for AgentSkills.
- CI and quality gates tightened: stronger API breakage detection/reporting (incl. oasdiff + release gating + deprecation policy), PR review agent guardrails, and a code review rule to block version bumps.
- DevEx + platform compatibility improvements: multiple terminal/tmux reliability fixes (incl. macOS heredoc hang + tmux socket isolation), Windows import guards, plus dependency/model config updates (e.g., Qwen, Gemini preview, GPT-* codex/variant detection, security CVE bumps).
What's Changed
- feat: Add ACPAgent for Agent Client Protocol integration by @simonrosenberg in #2133
- chore: add version bump blocking rule to code review guide by @xingyaoww in #2161
- Release v1.11.5 by @all-hands-bot in #2160
- Fix API breakage check robustness and reporting by @enyst in #2098
- ci: skip api breakage check when prev lacks all by @enyst in #2165
- Add ask_agent support to ACPAgent via fork_session by @simonrosenberg in #2145
- feat(sdk): Add TestLLM class for testing without mocking LiteLLM by @VascoSch92 in #2016
- DRAFT: docs(AGENTS): how to reply to GitHub inline review threads via REST API by @enyst in #2090
- fix(terminal): avoid sending C-l in TmuxTerminal.clear_screen() by @rbren in #2166
- fix(terminal): use dedicated tmux socket to isolate from user sessions by @rbren in #2167
- fix(Makefile): help and pre-commit by @VascoSch92 in #2168
- context: load project skills from git root by @enyst in #2164
- fix: preserve conversation updated_at across server restarts by @rbren in #2172
- ci: Gate API breakage failures on release PRs by @enyst in #2173
- Visualize ACP tool calls by @simonrosenberg in #2162
- Fix long heredoc commands hanging in SubprocessTerminal on macOS by @jpshackelford in #2182
- fix: Include boto3 extra in agent-server Docker image by @rbren in #2188
- Add LLM models and providers endpoints to agent-server by @rbren in #2187
- Do not forward api_key to Bedrock calls by @enyst in #2195
- feat(tools): task tool set by @VascoSch92 in #2143
- fix(tools): remove BrowserToolSet from package init to reduce downstream bundle size by @malhotra5 in #2197
- feat(delegate): File-based agent definitions with markdown + YAML frontmatter by @VascoSch92 in #2183
- agent-server: support WebSocket auth via headers by @enyst in #1814
- Add dashscope/qwen3.5-flash-2026-02-23 model configuration by @juanmichelini in #2207
- chore(deps): bump python-multipart from 0.0.20 to 0.0.22 by @dependabot[bot] in #2210
- chore(deps): bump werkzeug from 3.1.1 to 3.1.6 by @dependabot[bot] in #2214
- chore(deps): bump cryptography from 46.0.3 to 46.0.5 by @dependabot[bot] in #2211
- chore(deps): bump pyasn1 from 0.6.1 to 0.6.2 by @dependabot[bot] in #2212
- Filter public skills by default marketplace by @neubig in #2205
- chore(deps): bump pypdf from 6.1.1 to 6.7.2 by @dependabot[bot] in #2213
- nit(tools): small refactoring test files by @VascoSch92 in #2215
- chore(deps): bump authlib from 1.6.5 to 1.6.6 by @dependabot[bot] in #2219
- chore(deps): bump virtualenv from 20.34.0 to 20.36.1 by @dependabot[bot] in #2220
- ci: enforce agent-server REST API deprecation policy by @enyst in #2232
- fix(sdk): handle newlines in JSON-stringified dict arguments by @VascoSch92 in #2217
- fix(tools): merge subagents metrics (TaskToolSet) by @VascoSch92 in #2222
- fix(tools): merge subagents metrics (DelegateTool) by @VascoSch92 in #2221
- fix(tool): race condition in dynamic Action wrapper class creation by @VascoSch92 in #2224
- Fix GLM-5 preflight check by filtering SDK-specific parameters by @juanmichelini in #2194
- docs(subagent): document loader invariants for file-based agents by @enyst in #2231
- refactoring(sdk): Remove hardcoded header from `get_factory_info()` by @VascoSch92 in #2234
- fix(sdk): add gpt-5.2-codex, gpt-5.3-codex, and gpt-5.2 to model-variant detection by @enyst in #2238
- ci: use oasdiff for agent-server REST API breakage detection by @enyst in #2240
- fix: override server_image default to None in DockerDevWorkspace by @simonrosenberg in #2243
- PR review agent: avoid approving eval-risk behavior changes by @enyst in #2246
- Add server-base-path support for VSCode in path-based routing by @chuckbutkus in #2241
- feat: autotitle conversations on first user message by @rbren in #2225
- PR review agent: make eval-risk approval policy repo-specific by @enyst in #2254
- fix: Include secrets in system prompt when added via update_secrets() by @rbren in #2171
- Fix LiteLLM cost tracking for provider-prefixed models by @ShreySatapara in #2257
- Set default temperature to None instead of 0.0 by @neubig in #1989
- fix(tools): guard Unix-only imports in terminal package for Windows by @WolffM in #2096
- feat(delegate): Built-in specialized agent types (Explore, Bash) by @VascoSch92 in #2201
- Change eval_limit from choice to free-form string input by @simonrosenberg in #2261
- fix: cap max_output_tokens when using max_tokens fallback by @csmith49 in #2264
- docs: README for `condenser` module by @csmith49 in #2262
- fix: use query parameters for git API endpoints to preserve path slashes by @chuckbutkus in #2249
- chore(deps): bump mcp from 1.17.0 to 1.23.0 by @dependabot[bot] in #2226
- chore(deps): bump pypdf from 6.7.2 to 6.7.4 by @dependabot[bot] in #2265
- chore(deps): bump pypdf from 6.7.4 to 6.7.5 by @dependabot[bot] in #2267
- Add gemini-3.1-pro-preview model to reasoning models #2269 by @mayeco in #2270
- Show full dynamic context in SystemPromptEvent.visualize by @xingyaoww in #2273
- fix(sdk/subagent): fix get_factory_info() by @VascoSch92 in #2271
- Update orjson to 3.11.7 to address CVE-2025-67221 by @aivong-openhands in #2268
- Fix: Security analyzer ignores LLM security_risk when no analyzer is configured by @juanmichelini in #2130
- chore(deps): bump fastmcp from 2.12.4 to 2.14.0 by @dependabot[bot] in #2266
- Add AGENTS.md for model additio...
v1.11.5
- Add support for Sonnet 4.6, GLM 5, Gemini 3.1 Pro, Minimax M2.5
- Multiple reliability fixes
What's Changed
- Release v1.11.4 by @all-hands-bot in #2011
- Fix workflow_run trigger for version-bump-prs by @xingyaoww in #2014
- Remove enterprise/pyproject.toml updates from version bump workflow by @xingyaoww in #2015
- chore: move skills directory to .agents by @enyst in #1970
- feat: Load project skills (AGENTS.md, etc.) in PR review action by @xingyaoww in #2017
- Feat: session recording agent's browser sessions by @malhotra5 in #1731
- feat: Add current_datetime field to AgentContext by @xingyaoww in #2012
- docs: trim blustery wording in docs update prompt by @smolpaws in #2019
- fix(ci): block PR review auto-run for author_association NONE by @enyst in #2020
- fix(pr-review): avoid blank description in prompt by @enyst in #2021
- feat(security): add GraySwan Cygnal security analyzer by @neubig in #1952
- fix: add circular reference detection to _process_schema_node by @neubig in #1956
- docs(review): clarify approval guidance by @enyst in #2022
- fix: bundle browser recording JS files in PyInstaller build by @malhotra5 in #2025
- Fix issue with remote docker workspaces. by @hamiltop in #1807
- Add GLM-5 to expected models by @juanmichelini in #2029
- fix(llm): ensure second LLM gets independent metrics and telemetry by @xingyaoww in #1997
- fix(pr-review): Set trace metadata within active span context by @neubig in #2062
- Add pr_url metadata to Laminar logging in PR review bot by @neubig in #2068
- Handle deprecated enable_truncation field in TextContent for backward compatibility by @xingyaoww in #2027
- fix(ci): Fix PR review evaluation workflow artifact detection by @neubig in #2070
- DRAFT: Add feature-release-rollout skill for multi-repo feature propagation by @neubig in #2069
- ci: API breakage checks for SDK (Griffe) by @enyst in #1098
- ci: require deprecation marker before exported symbol removal by @enyst in #2072
- ci: extend API breakage checks to openhands-workspace by @enyst in #2075
- Update issue number in integration runner workflow by @enyst in #2079
- SDK: bound latest-user-message scan in Agent.step by @enyst in #1844
- ci: extend API breakage checks to openhands-tools by @enyst in #2080
- docs(sdk): add scoped AGENTS.md by @enyst in #2081
- docs: add package-scoped AGENTS.md guides by @enyst in #2086
- docs: align AGENTS.md guidance across repo by @enyst in #2087
- Enforce deprecation period for removed public methods by @enyst in #2083
- fix: validate provider prefixes in unverified model list by @enyst in #1668
- feat(pr-review): Support multiple models with random selection for A/B testing by @neubig in #2024
- feat(skills): load user skills from ~/.agents/skills by @cbagwell in #2091
- refactor: update PUBLIC_SKILLS_REPO to OpenHands/extensions by @neubig in #2085
- security: fix HTTP-related CVEs in transitive dependencies by @raymyers in #2094
- security: fix protobuf and pillow CVEs in transitive dependencies by @raymyers in #2095
- feat(sdk): add fallback strategy (`LLM`) by @VascoSch92 in #2093
- Update sdk_ref default in run-eval.yml during release preparation by @juanmichelini in #1938
- Add claude-sonnet-4-6 to expected models by @juanmichelini in #2102
- fix: make skill loading resilient to individual skill errors by @jpelletier1 in #2108
- Remove jade-spark-2862 alias, add its settings to minimax-m2.5 by @neubig in #2106
- fix: Forcing minimum condenser progress by @csmith49 in #2107
- fix: disable vision for GLM-5 model by @juanmichelini in #2111
- Add preflight LLM check before dispatching evaluations by @neubig in #2109
- Add configurable file editing toolset support by @neubig in #2077
- chore: Rename code-review.md to custom-codereview-guide.md by @xingyaoww in #2121
- Disable uv Actions cache in PR review agent workflow by @enyst in #2123
- fix: Move litellm install before model loading and use correct API key by @juanmichelini in #2118
- Fix: Make litellm import lazy in resolve_model_config.py by @juanmichelini in #2125
- feat(hooks): add async hook execution support by @ixchio in #1849
- feat(condenser): Explicit view properties by @csmith49 in #2116
- Fix: Add 180-minute timeout to integration tests to prevent indefinite hangs by @juanmichelini in #2131
- Fix: Add claude-sonnet-4-6 to EXTENDED_THINKING_MODELS by @juanmichelini in #2138
- Enable Datadog persistence by default in eval job by @juanmichelini in #2140
- Fix PR review workflow by @enyst in #2126
- fix: Disable browser tools in integration tests to fix ProcessPoolExecutor hang by @neubig in #2149
- feat: Add Gemini 3.1 Pro to evaluation model config by @neubig in #2153
- CI: simplify pr-review gating and skip forks by @enyst in #2159
- Add API compliance test framework for malformed message patterns by @csmith49 in #2155
- Update integration tests to use claude-sonnet-4-6 by @xingyaoww in #2113
- Docs: fix README links and typos by @enyst in #2158
- Add claude-sonnet-4-6 to verified models by @juanmichelini in #2104
- Add event-sourcing system benchmarks by @simonrosenberg in #2032
New Contributors
- @smolpaws made their first contribution in #2019
- @jpelletier1 made their first contribution in #2108
Full Changelog: v1.11.4...v1.11.5
v1.11.4
What's Changed
- fix: serialize Laminar span_context UUIDs as strings for JSON compatibility by @neubig in #2000
- feat: add previous review context to PR review agent by @xingyaoww in #1991
- Separate version bump PRs into standalone workflow by @xingyaoww in #2003
- Release v1.11.3 by @all-hands-bot in #2002
- fix: fetch latest reviews/threads instead of oldest by @xingyaoww in #2004
- Fix poetry lock command in version-bump-prs workflow by @xingyaoww in #2006
- Remove redundant poetry lock calls in version-bump-prs workflow by @xingyaoww in #2008
- Add MiniMax-M2.5 model configuration by @neubig in #2007
- Add pre-commit step to version-bump-prs workflow by @xingyaoww in #2010
Full Changelog: v1.11.3...v1.11.4
v1.11.3
What's Changed
- Release v1.11.2 by @all-hands-bot in #1976
- fix: exclude reference markdown files in agentskills folders from being loaded as skills by @xingyaoww in #1982
- Add stop hook for pre-commit, pytest, and CI validation and add /api/hooks to agent-server by @xingyaoww in #1878
- feat(eval): add jade-spark-2862 model for evaluation by @neubig in #1984
- ci: switch integration tests to use eval proxy by @neubig in #1985
- fix: separate static system prompt from dynamic context for cross-conversation caching by @neubig in #1890
- Fix Laminar trace continuation for PR review evaluation by @neubig in #1988
- fix(llm): detect context window errors from Minimax via APIConnectionError by @neubig in #1992
- feat: add rejection_source field to UserRejectObservation by @xingyaoww in #1995
- docs(AGENTS.md): add GraphQL examples for resolving review threads by @neubig in #1993
Full Changelog: v1.11.2...v1.11.3
v1.11.2
What's Changed
- Release v1.11.1 by @all-hands-bot in #1921
- Add litellm_proxy/gpt-oss-20b to resolve_model_config.py by @juanmichelini in #1906
- fix: Remove python-version from setup-uv in create-version-bump-prs job by @juanmichelini in #1926
- Add Claude Opus 4.6 model support by @neubig in #1933
- Set default sdk_ref to v1.11.1 in run-eval workflow by @juanmichelini in #1935
- feat(llm): add gpt-5.3-codex to subscription models by @enyst in #1940
- Add example for reconstructing OpenAI messages from events by @enyst in #1916
- feat: add /ready endpoint for proper Kubernetes readiness checks by @neubig in #1810
- fix(llm): Add Claude Opus 4.6 to model features for reasoning_effort support by @simonrosenberg in #1941
- nit(sdk): change usage_to_llm to return a MappingProxyType by @VascoSch92 in #1930
- fix: use Python 3.12 for openhands-cli version bump by @xingyaoww in #1924
- feat(ci): add model_ids and issue_number inputs to integration-runner by @neubig in #1883
- Add Laminar traces to PR review workflow by @juanmichelini in #1949
- fix(message): Responses tool image serialization (preserve file_editor images) by @Wangmerlyn in #1895
- fix: disable vision for glm-4.7 to prevent multimodal evaluation failures by @juanmichelini in #1898
- Fix tool serialization test flakiness by @enyst in #1959
- refactor: Use composite GitHub Action for PR review workflow by @xingyaoww in #1927
- chore: add review thread gate by @enyst in #1962
- fix: update LLM model and base URL for pr-review workflow by @xingyaoww in #1968
- feat: Add delayed evaluation for PR review with Laminar signals by @neubig in #1954
- Add context window size validation for local LLMs by @neubig in #1961
- feat(sdk): introduce `LLMProfileStore` for persisted LLM configurations by @VascoSch92 in #1928
- fix(pyinstaller): include delegate tool templates in bundle by @malhotra5 in #1971
- fix: handle PS1 metadata corruption in command output by @neubig in #1817
- Enhance planning agent to ask clarifying questions for ambiguous requests by @xingyaoww in #1967
- feat(examples): update critic example to demonstrate iterative refinement by @xingyaoww in #1879
New Contributors
- @VascoSch92 made their first contribution in #1930
Full Changelog: v1.11.1...v1.11.2
v1.11.1
What's Changed
- Release v1.11.0 by @all-hands-bot in #1884
- Fix license by @enyst in #1869
- Add qwen3-coder-next to expected models by @juanmichelini in #1892
- utils(release): Fix poetry lock conflict and move openhands-cli step earlier by @xingyaoww in #1888
- Change qwen3-coder-next model from together.ai to openrouter by @juanmichelini in #1900
- Fix truncation at LLM message layer by @enyst in #1838
- Add qwen3-coder-30b-a3b-instruct model to resolve_model_config.py by @juanmichelini in #1903
- feat(llm): add gpt-5.2-codex to verified models by @juanmichelini in #1893
- fix: serialize tmux session creation to prevent race conditions by @neubig in #1889
- Update code-review skill with repo-specific approval guidelines by @xingyaoww in #1904
- fix(mcp): handle timeout errors gracefully in MCP tool execution by @neubig in #1862
- feat(llm): add Kimi K2.5 to verified models by @juanmichelini in #1907
- Add conversation.execute_tool() method for pre-run tool execution by @xingyaoww in #1833
- Clarify execute_tool bypasses confirmation/security checks by @enyst in #1917
- chore: remove blacksmith CI runners and use GitHub's default runners by @xingyaoww in #1915
- fix: revert agent-server Docker image to Python 3.12 by @neubig in #1910
- Cache Qwen3 tokenizer config for critic template tests by @enyst in #1919
- fix: wait for WebSocket terminal status to prevent event loss by @xingyaoww in #1832
- Add skill for debugging test-examples workflow by @xingyaoww in #1887
- feat(skills): support .agents/skills directory by @enyst in #1914
- fix: use blacksmith runners for all test jobs to fix coverage by @xingyaoww in #1920
Full Changelog: v1.11.0...v1.11.1