[Discussion] Client-side tools inside the codemode execute sandbox #1735

## Context

The codemode connector runtime (`@cloudflare/codemode` + Think's `createExecuteTool`) adapts AI SDK `ToolSet`s into the sandbox via `ToolSetConnector`. Tools **without an `execute` function** — client-side tools resolved in the browser (`getUserTimezone`, `ask_user`, …) — are currently **excluded** from both the sandbox bindings and the generated types, with a one-time warning. The model can still call them as ordinary top-level tools, just not from inside `execute` code.
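A minimal sketch of the skip-and-warn behavior described above, for readers who have not seen the connector code. It is not the actual `ToolSetConnector` implementation: `ToolLike`, `partitionTools`, and the warning text are hypothetical names, and the only assumption carried over from the issue is that a tool without an `execute` function is treated as client-side.

```typescript
// Hypothetical stand-in for an AI SDK tool; only the fields this sketch needs.
type ToolLike = {
  description?: string;
  execute?: (input: unknown) => Promise<unknown> | unknown;
};

type Partition = {
  bindable: Record<string, ToolLike>; // would get sandbox bindings and generated types
  skipped: string[]; // client-side tools, left as ordinary top-level tools
};

// Remembers which tool names have already produced a warning.
const warnedTools = new Set<string>();

function partitionTools(
  tools: Record<string, ToolLike>,
  warn: (message: string) => void = console.warn
): Partition {
  const bindable: Record<string, ToolLike> = {};
  const skipped: string[] = [];
  for (const [name, tool] of Object.entries(tools)) {
    if (typeof tool.execute === "function") {
      bindable[name] = tool;
      continue;
    }
    skipped.push(name);
    if (!warnedTools.has(name)) {
      warnedTools.add(name); // warn once per tool name, not on every call
      warn(`codemode: skipping client-side tool "${name}" (no execute function)`);
    }
  }
  return { bindable, skipped };
}
```

Calling `partitionTools` twice with the same tool set returns the same partition both times but only emits the warning on the first call.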

This issue is to discuss whether (and how) to bridge them.

## Option A — status quo (skip + warn)

Client tools stay top-level. Usually fine: they're interactive one-offs that rarely benefit from being batched inside sandboxed code. Zero new surface area.

## Option B — generalize the pause machinery into value-carrying resolution

The runtime already pauses durably on `requiresApproval` and resumes via `approve()`. A client tool is the same shape with one twist: instead of approve → *the server executes the call*, the **client computes the result and posts it back**. Sketch:

1. **Runtime**: a `resolve(executionId, seq, result)` RPC that flips a `pending` log entry straight to `applied` with the supplied result, then resumes the run — `approve`/`reject` become special cases of resolution.
2. **`ToolSetConnector`**: expose execute-less tools as pause-always entries, annotated e.g. `resolution: "client"` so hosts/UIs can distinguish "needs a human yes/no" from "needs the client to run something".
3. **Think**: surface these in `pendingExecutions()`; the SPA routes them through its existing `onToolCall` handler and posts the result back via a `resolveExecution(executionId, seq, result)` callable. The transcript's paused tool output is replaced and the chat auto-continues — identical flow to the existing approval cards.
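To make step 1 concrete, here is a rough in-memory sketch of value-carrying resolution on a single run's log. Everything in it is an assumption for illustration: the `LogEntry` shape, the `ExecutionLog` class, and its method signatures are hypothetical, the real log is durable rather than a `Map`, and `executionId` is omitted because the sketch models one execution only.

```typescript
// Hypothetical log-entry shape; the real runtime's log format is not shown in this issue.
type LogEntry =
  | { seq: number; tool: string; status: "pending"; resolution: "approval" | "client" }
  | { seq: number; tool: string; status: "applied"; result: unknown }
  | { seq: number; tool: string; status: "rejected"; reason: string };

class ExecutionLog {
  private entries = new Map<number, LogEntry>();

  addPending(seq: number, tool: string, resolution: "approval" | "client"): void {
    this.entries.set(seq, { seq, tool, status: "pending", resolution });
  }

  get(seq: number): LogEntry | undefined {
    return this.entries.get(seq);
  }

  // Only pending entries may change state.
  private takePending(seq: number): LogEntry {
    const entry = this.entries.get(seq);
    if (!entry || entry.status !== "pending") {
      throw new Error(`log entry ${seq} is not pending`);
    }
    return entry;
  }

  // The generalized primitive: pending -> applied with a caller-supplied result.
  resolve(seq: number, result: unknown): void {
    const entry = this.takePending(seq);
    this.entries.set(seq, { seq, tool: entry.tool, status: "applied", result });
  }

  // approve as a special case: the server runs the call, then resolves with its output.
  async approve(seq: number, execute: () => Promise<unknown>): Promise<void> {
    this.takePending(seq);
    this.resolve(seq, await execute());
  }

  // reject as the other special case: pending -> rejected, no result recorded.
  reject(seq: number, reason: string): void {
    const entry = this.takePending(seq);
    this.entries.set(seq, { seq, tool: entry.tool, status: "rejected", reason });
  }
}
```

In this framing the only difference between an approval and a client tool is who produces the value passed to `resolve`: the server's `execute` or the browser's `onToolCall` handler.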

Model-written code could then do:

```ts
const tz = await tools.getUserTimezone({});
```

and the run durably parks until the browser answers, with abort-and-replay handling the rest for free (the resolved value is recorded in the log and replays like any applied result).
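The abort-and-replay behavior can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the runtime's actual mechanism: `PauseSignal`, `replay`, and the `recorded` map are hypothetical, the log is an in-memory `Map` keyed by call sequence number, and every tool is treated as a client tool.

```typescript
// Thrown (as a plain object) to abort the run at the first unresolved tool call.
class PauseSignal {
  seq: number;
  tool: string;
  constructor(seq: number, tool: string) {
    this.seq = seq;
    this.tool = tool;
  }
}

type SandboxTools = Record<string, (args: unknown) => Promise<unknown>>;

type Outcome<T> =
  | { status: "done"; value: T }
  | { status: "paused"; seq: number; tool: string };

// Re-runs `code` from the top. Calls whose seq is already recorded return the
// recorded value; the first call without a recorded value aborts the pass.
async function replay<T>(
  code: (tools: SandboxTools) => Promise<T>,
  toolNames: string[],
  recorded: Map<number, unknown>
): Promise<Outcome<T>> {
  let nextSeq = 0;
  const tools: SandboxTools = {};
  for (const name of toolNames) {
    tools[name] = async () => {
      const seq = nextSeq++;
      if (recorded.has(seq)) return recorded.get(seq);
      throw new PauseSignal(seq, name);
    };
  }
  try {
    return { status: "done", value: await code(tools) };
  } catch (err) {
    if (err instanceof PauseSignal) {
      return { status: "paused", seq: err.seq, tool: err.tool };
    }
    throw err;
  }
}
```

With code that awaits `tools.getUserTimezone({})` and then `tools.ask_user({})`, the first pass pauses at seq 0, a second pass (after the client's answer is recorded for seq 0) pauses at seq 1, and a third pass completes. Each pass re-executes the code from the start, which is the round-trip cost raised in the open questions.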

### Costs / open questions

- **Trust surface**: the client supplies a recorded "result" that replays as ground truth. Size limits apply (`MAX_DURABLE_VALUE_BYTES`), but validation against the tool's *output* shape doesn't exist today.
- **UX**: multiple pending interactions per run. A paused run currently exposes one pending action at the abort point, and client tools would keep that property. Chained client calls therefore mean pause → resolve → pause → resolve round trips, each a full replay pass.
- **Expiry semantics**: `expirePaused` would reject runs waiting on a client that never answers — probably the right default, but worth stating.
- **Offline clients**: a run paused on a client tool with no connected client is stuck until expiry; should `pendingExecutions()` distinguish these so UIs can prompt reconnection?
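One possible way to narrow the trust surface is to check a client-supplied result before it is recorded. The sketch below is only an illustration of that idea: `checkClientResult` and `Validator` are hypothetical, the 64 KiB limit is a placeholder because the real value of `MAX_DURABLE_VALUE_BYTES` is not given here, and the output validator is assumed to be supplied by whoever defines the tool.

```typescript
// Placeholder limit; the actual MAX_DURABLE_VALUE_BYTES value is not stated in this issue.
const ASSUMED_MAX_DURABLE_VALUE_BYTES = 64 * 1024;

// Hypothetical per-tool output check, e.g. derived from an output schema.
type Validator<T> = (value: unknown) => value is T;

type Checked<T> = { ok: true; value: T } | { ok: false; error: string };

function checkClientResult<T>(
  raw: unknown,
  isValid: Validator<T>,
  maxBytes: number = ASSUMED_MAX_DURABLE_VALUE_BYTES
): Checked<T> {
  // The value must survive JSON serialization to be recorded in a durable log.
  let json: string | undefined;
  try {
    json = JSON.stringify(raw);
  } catch {
    json = undefined; // e.g. circular structures or BigInt values
  }
  if (json === undefined) {
    return { ok: false, error: "result is not JSON-serializable" };
  }
  // Measure the UTF-8 byte length, not the string length.
  const bytes = new TextEncoder().encode(json).length;
  if (bytes > maxBytes) {
    return { ok: false, error: `result is ${bytes} bytes, limit is ${maxBytes}` };
  }
  if (!isValid(raw)) {
    return { ok: false, error: "result does not match the tool's output shape" };
  }
  return { ok: true, value: raw };
}
```

A host could run a check like this inside `resolveExecution` and reject the resolution on failure, so that only shape-checked values are recorded and replayed.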

## Proposal

Keep Option A for the current release (already shipped in the connector-runtime PR). Build Option B only when a concrete use case needs batched client interactions inside sandbox code — the design above shows it's an incremental extension of the existing approval flow rather than a rework.
