feat(selection): stream select_path requests and resolve picks mid-turn#77

Open
hazemahmedx0 wants to merge 3 commits into staging from feat/inline-path-selector

hazemahmedx0 commented Jun 21, 2026

This is the server half of the inline path picker. anton can now ask for a file or folder mid-turn through a select_path tool, but it needs a host to carry that request out to the client and feed the answer back into the same paused turn. That's what this does.

Before: [image]

After: [Screenshot 2026-06-22 at 2 16 27 AM]

https://www.loom.com/share/5495d1aff1a7488db532fbdbd0f55926

What's in here:

  • A StreamingSelectionElicitor that implements anton's elicitor protocol. When the agent asks for a path, it emits a selection event onto the live SSE stream and then waits on a future for the answer.
  • A SelectionGateway that holds the open requests as asyncio futures, keyed by conversation and request id, so the pick that arrives on a separate HTTP call can resolve the exact turn that's waiting.
  • A /responses/selection endpoint the client posts the chosen path to. It resolves the future, the tool returns, and the turn continues. A 404 means nothing was waiting.
  • The merge queue between the agent turn and the SSE response now uses bounded async puts instead of put_nowait, so a slow client applies backpressure instead of dropping events.
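A minimal sketch of the rendezvous idea behind the gateway, not the code in this PR. The class name comes from the description above; the method names `open` and `resolve`, the ids, and the example path are assumptions made for illustration.

```python
import asyncio
from typing import Optional


class SelectionGateway:
    """Holds one future per open request, keyed by (conversation_id, request_id)."""

    def __init__(self) -> None:
        self._pending: dict = {}

    def open(self, conversation_id: str, request_id: str) -> asyncio.Future:
        # Hypothetical method name: called by the elicitor before it waits.
        fut = asyncio.get_running_loop().create_future()
        self._pending[(conversation_id, request_id)] = fut
        return fut

    def resolve(self, conversation_id: str, request_id: str,
                path: Optional[str]) -> bool:
        # Hypothetical method name: called by the HTTP endpoint.
        # Returning False is the case the endpoint would map to a 404.
        fut = self._pending.pop((conversation_id, request_id), None)
        if fut is None or fut.done():
            return False
        fut.set_result(path)  # path=None stands for a dismissed picker
        return True


async def demo() -> tuple:
    gateway = SelectionGateway()
    fut = gateway.open("conv-1", "req-1")
    # Simulate the pick arriving later on a separate HTTP call.
    asyncio.get_running_loop().call_later(
        0.01, gateway.resolve, "conv-1", "req-1", "/tmp/report.csv"
    )
    picked = await fut  # the paused turn resumes here
    stale = gateway.resolve("conv-1", "req-1", "/tmp/other")  # nothing waiting
    return picked, stale


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Keying on both ids is what lets a pick resolve the exact turn that is waiting, and popping the entry on resolve means a repeated or late post finds nothing to resolve.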

Two correctness fixes worth calling out:

  • The elicitor is only turned on for interactive turns. A non-interactive caller never gets a picker it can't answer.
  • Cancellation is re-raised instead of swallowed, so hitting Stop while a pick is pending actually tears the turn down. A user dismissing the picker is a separate thing, represented by the future resolving with null.
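To make the Stop versus dismiss distinction concrete, here is a small sketch under stated assumptions. `await_pick` is a hypothetical helper, not a function from this PR; it only shows that a `None` result and an `asyncio.CancelledError` travel on different paths.

```python
import asyncio
from typing import Optional


async def await_pick(fut: asyncio.Future) -> Optional[str]:
    """Hypothetical helper: wait for the pick without hiding a cancel."""
    try:
        return await fut  # None here means the user dismissed the picker
    except asyncio.CancelledError:
        # Any cleanup would go here. Re-raising lets the turn stop;
        # returning None instead would look like an empty pick.
        raise


async def demo() -> tuple:
    loop = asyncio.get_running_loop()

    # Case 1: the user dismisses the picker, so the future resolves with None.
    dismissed = loop.create_future()
    loop.call_soon(dismissed.set_result, None)
    pick = await await_pick(dismissed)

    # Case 2: Stop is pressed while the pick is still pending.
    pending = loop.create_future()
    turn = asyncio.create_task(await_pick(pending))
    await asyncio.sleep(0)  # let the task start waiting on the future
    turn.cancel()
    try:
        await turn
        stopped = False
    except asyncio.CancelledError:
        stopped = True
    return pick, stopped


if __name__ == "__main__":
    print(asyncio.run(demo()))
```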

One removal: the in-app /fs directory-listing endpoint is gone. The client now uses the native OS picker, so the server no longer needs to list the user's filesystem over HTTP. Less surface area, and the agent already has its own file access.

Depends on the anton select_path change. The client PR for cowork sits on top of this one.


Part of the inline path picker stack. Merge bottom up:

  1. anton: feat(select_path): inline file/folder picker for ambiguous paths anton#197 (the select_path tool and elicitor protocol)
  2. cowork-server: feat(selection): stream select_path requests and resolve picks mid-turn #77 (this PR, streams the request, resolves the pick)
  3. cowork: feat(selector): inline native path picker for ambiguous paths cowork#212 (the native OS picker UI)

Related but independent: mindshub_inference #302 (https://github.com/mindsdb/mindshub_inference/pull/302) is the image side of the same multimodal and file effort. It translates image_url blocks for every provider and transcodes unsupported image types to PNG. It does not depend on this stack and this stack does not depend on it.

Bridges anton's select_path tool to the GUI without a turn boundary:

- SelectionGateway: process-global rendezvous of (conversation_id, request_id)
  -> asyncio.Future, resolved out-of-band by the new endpoint.
- StreamingSelectionElicitor (anton's SelectionElicitor protocol): emits a
  SelectionRequestEvent into the turn stream, then awaits the gateway future.
- harness.stream_response now drives turn_stream in a pump task feeding a merge
  queue, so the request event reaches the client before the tool blocks; cancels
  cleanly and unblocks pending selections on turn end.
- stream_formatter emits response.selection.requested SSE for the event.
- POST /responses/selection delivers the chosen path (or null cancel) into the
  paused turn; 404 when nothing is awaiting.
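For reference, a sketch of what formatting that event as an SSE frame could look like. The event name `response.selection.requested` is taken from the commit message; the dataclass fields, the function name `format_selection_sse`, and the JSON payload shape are assumptions, not the actual stream_formatter code.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SelectionRequestEvent:
    # Field names are assumed for illustration only.
    request_id: str
    mode: str
    root: str


def format_selection_sse(event: SelectionRequestEvent) -> str:
    """Render one Server-Sent Events frame: event line, data line, blank line."""
    payload = json.dumps(asdict(event))
    return f"event: response.selection.requested\ndata: {payload}\n\n"


if __name__ == "__main__":
    print(format_selection_sse(
        SelectionRequestEvent(request_id="req-1", mode="folder", root="/home/user")
    ))
```

The trailing blank line is what terminates a frame in the SSE wire format, so the client can dispatch the request as soon as it arrives.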

GET /fs/list lists a directory's children (folders first; files when kind!=folder;
hidden entries excluded by default; capped; loopback desktop scope). SelectionRequestEvent
and the SSE frame now carry mode + root so the UI can render a browser.
…active

Drops the in-app /fs directory-listing endpoint now that the client uses the
native OS picker, so the server no longer lists the user's filesystem. The
streaming path turns the selection elicitor on only for interactive turns,
backpressures the merge queue with a bounded async put instead of put_nowait,
and re-raises cancellation so a Stop actually tears the turn down rather than
being swallowed as an empty pick.
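
The pump task and bounded merge queue can be sketched roughly as follows. This is an illustration under assumptions, not the harness code: `turn_stream` here is a stand-in generator, and the queue size, sentinel, and `demo` consumer are made up for the example.

```python
import asyncio

_DONE = object()  # sentinel marking the end of the turn


async def turn_stream():
    """Stand-in for the agent turn; yields a few events."""
    for i in range(5):
        yield f"event-{i}"


async def pump(queue: asyncio.Queue) -> None:
    async for event in turn_stream():
        # Bounded async put: waits while the queue is full, so a slow
        # consumer slows the producer. put_nowait would raise
        # asyncio.QueueFull here instead of waiting.
        await queue.put(event)
    await queue.put(_DONE)


async def stream_response(maxsize: int = 2):
    queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
    pump_task = asyncio.create_task(pump(queue))
    try:
        while True:
            item = await queue.get()
            if item is _DONE:
                break
            yield item
    finally:
        # On normal end or early close, stop the pump and wait for it.
        pump_task.cancel()
        await asyncio.gather(pump_task, return_exceptions=True)


async def demo() -> list:
    received = []
    async for event in stream_response():
        await asyncio.sleep(0.001)  # simulate a slow client
        received.append(event)
    return received


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Running the turn in its own task is what lets an event reach the consumer while the producer is still blocked further along, which is the property the selection request relies on.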

hazemahmedx0 reopened this Jun 21, 2026
hazemahmedx0 requested a review from torrmal June 21, 2026 23:24