fix(subagent): dispatch subagents through per-agent provider; enforce allowlist on self-spawn; attribute responses by securityguy · Pull Request #1814 · sipeed/picoclaw

securityguy · 2026-03-20T03:11:20Z

Problems

1. Subagent always used the global provider

SubagentManager was initialised with the single global LLMProvider passed into NewAgentLoop, plus the calling agent's model name string (e.g. "openrouter-gpt-5.4"). When the global provider happened to be claude-cli, every spawn attempt failed because claude was invoked with --model openrouter-gpt-5.4 — a picoclaw model alias it does not recognise.

2. Named agent delegation was unimplemented scaffolding

The spawn tool accepts an agent_id parameter described as "Optional target agent ID to delegate the task to", and SubagentTask stores the value — but runTask always used sm.provider and sm.defaultModel regardless. Spawning agent_id="karen" ran with Amber's provider, not Karen's.

3. Allowlist did not cover self-spawns

The allowlist check in SpawnTool only ran when agent_id != "". An empty agent_id bypassed the check entirely, so allow_agents: ["karen"] could not prevent an agent from spawning itself.

4. Subagent responses had no attribution

When a named subagent completed a task its raw output was published directly to the user with no indication of which agent produced it.

Changes

pkg/tools/subagent.go

NewSubagentManager now takes a SubagentManagerConfig struct that accepts a ProviderDispatcher, FallbackChain, the calling agent's own SelfCandidates, its CallerAgentID, and a CandidateResolver callback for looking up other agents' candidates.
Added resolveLoopConfig(agentID): uses selfCandidates for self-spawns and resolves the target agent's candidates via CandidateResolver for named spawns, so each subagent runs with its own configured provider and model.
Added attributedContent(agentID, content): prefixes the user-visible result with the agent's name in bold when a named agent produced it. Self-spawns are unaffected.

pkg/tools/toolloop.go

ToolLoopConfig gains Dispatcher, Candidates, and Fallback fields. When set, RunToolLoop dispatches per-candidate through the ProviderDispatcher with full fallback chain support, mirroring the main agent loop. Falls back to the existing Provider/Model pair when not configured.

pkg/tools/spawn.go

Self-spawns (empty agent_id) now resolve to the caller's own ID before the allowlist check, so allow_agents: ["karen"] correctly rejects an unqualified spawn in addition to blocking explicit agent_id="amber" calls.

pkg/agent/loop.go

registerSharedTools accepts dispatcher and fallbackChain and constructs each agent's SubagentManager with its own candidates, caller ID, and a registry-backed candidate resolver closure.
The fallback chain is built before registerSharedTools is called so it can be shared with both the main loop and the subagent managers.
ReloadProviderAndConfig builds a fresh fallback chain for the reloaded registry.

Some providers (via OpenRouter) reject assistant messages with "content": "" alongside tool_calls. The OpenAI spec permits content to be absent when tool_calls is set. Switch openaiMessage.Content from string to *string with omitempty and introduce msgContent() to return nil when content is empty and tool calls are present. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ields Some OpenAI-compatible providers (e.g. OpenRouter routing to strict backends) reject non-standard fields in the request body such as reasoning_content in messages and extra_content / thought_signature in tool calls. Add a per-model strict_compat: true config option that strips these fields before serialization. Implementation: - Add StrictCompat bool to config.ModelConfig - Add WithStrictCompat option to openai_compat.Provider - Refactor HTTPProvider constructors into a single NewHTTPProviderWithOptions using variadic openai_compat.Option, eliminating the growing list of named constructors - Thread StrictCompat through CreateProviderFromConfig via composed options Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

When the claude CLI exits with a non-zero status, the previous error handler only checked stderr. However, the CLI writes its output (including error details) to stdout, especially when invoked with --output-format json. This left the caller with only "exit status 1" and no actionable information. Now includes both stderr and stdout in the error message so the actual failure reason is visible in logs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add claude-cli and codex-cli to the supported vendors table and include vendor-specific configuration examples explaining: - No API key is required (uses existing CLI subscription) - The claude-code sentinel model ID skips --model flag so the CLI uses its own configured default model Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Add channels.telegram_bots config allowing multiple Telegram bot tokens to be configured, each mapped to a separate channel (e.g. telegram-amber, telegram-karen). Each channel can be independently bound to an agent via the bindings config, enabling distinct AI personas behind separate bots. Backward compatibility is preserved: the existing channels.telegram single-entry config continues to work unchanged. On load it is normalized into telegram_bots as an entry with id "default", which produces the channel name "telegram" so all existing bindings remain valid. Key changes: - config: add TelegramBotConfig struct with ChannelName/AsTelegramConfig helpers; add TelegramBots field to ChannelsConfig; normalize legacy single entry into list on load - telegram: add NewTelegramChannelFromConfig constructor accepting TelegramConfig + explicit channel name (avoids import cycle) - channels: add TelegramBotFactory registry; add injectChannelDependencies helper to eliminate injection code duplication; add duplicate channel name guard in initTelegramBot; update initChannels to iterate over TelegramBots; add prefix-based rate limit fallback for named bots Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…le.json Add two disabled example bots (alice, bob) under channels.telegram_bots and corresponding top-level bindings to illustrate how multiple Telegram bots map to separate named channels and agents. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds GeminiCliProvider that wraps the Gemini CLI as a subprocess, following the same pattern as the existing claude-cli and codex-cli providers. The provider invokes: gemini --yolo --output-format json --prompt "" with the prompt sent via stdin. The --prompt "" flag enables non-interactive (headless) mode, reading the full prompt from stdin. Key details: - Model sentinel: "gemini-cli" skips --model flag (uses CLI default) - Explicit model: "gemini-cli/gemini-2.5-pro" passes --model gemini-2.5-pro - System messages prepended to stdin (no --system-prompt flag in gemini) - Parses JSON response format: {"response": "...", "stats": {"models": {...}}} - Token usage summed across all models in stats.models (gemini uses multiple internal models per request) - Tool calls extracted from response text using shared extractToolCallsFromText - New protocol: "gemini-cli" / alias "geminicli" Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add PR sipeed#1633 (gemini-cli provider) to contributions table - Add configuration guide section covering: - claude-cli, codex-cli, and gemini-cli providers with model_list examples - Multiple Telegram bots with bindings and per-agent config - Agent workspace and personality file notes Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Adds GeminiCliProvider (PR sipeed#1633 against sipeed/picoclaw). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Replace Amber/Karen with Alice/Bob in all README examples for consistency. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Previously all agents shared a single LLMProvider instance created from agents.defaults.model_name. Per-agent model config (agents.list[].model) only changed the model string passed to Chat() — it never changed which provider binary was invoked. This caused cross-provider fallback chains (e.g. gemini-cli falling back to claude-cli) to fail, and made it impossible to assign different CLI providers to different agents. Introduces ProviderDispatcher which lazily creates and caches provider instances keyed by "protocol/modelID". The fallback chain's run closure now resolves the correct provider via the dispatcher before falling back to agent.Provider for backward compatibility. References sipeed#1634 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Brings in ProviderDispatcher fix (PR sipeed#1637 against sipeed/picoclaw). References sipeed#1634. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ndling Tool call detection previously relied on a literal strings.Index for '{"tool_calls"' which failed whenever the LLM returned pretty-printed JSON (newline after '{') or wrapped the output in markdown code fences. Arguments typed as a JSON object instead of an encoded string also caused a silent parse failure and leaked the raw JSON block to the user. Changes: - Strip markdown code fences (` + "```json" + ` / ` + "```" + `) before parsing - Locate JSON candidate via first '{' / last '}' instead of literal match - Unmarshal directly and check for top-level "tool_calls" key - Accept arguments as either a JSON-encoded string or a plain JSON object - Remove dead findMatchingBrace function and its tests - Publish response.Content to the user immediately when a response contains both text and tool calls (previously the text was silently discarded into session history) - Fix pre-existing test bug: TestCreateProvider_GeminiCliDefaultWorkspace now clears Agents.Defaults.Workspace before testing the '.' fallback Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

requiresRuntimeProbe and probeLocalModelAvailability handled claude-cli and codex-cli but not gemini-cli, causing the launcher to report "default model has no credentials configured" and skip auto-start when gemini-cli was set as the default model. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

TryAutoStartGateway only checked gateway.cmd, which tracks processes the launcher itself spawned. A gateway managed externally (e.g. via systemd) was invisible to this check, causing two problems: 1. The launcher started a duplicate gateway instance on every launch. 2. The WebUI showed "Gateway Not Running" even when it was healthy. Fix: probe the gateway health endpoint in two places: - TryAutoStartGateway: skip auto-start if the health endpoint responds. - gatewayStatusData: report "running" when the launcher has no owned process but the health endpoint is responding. Launcher-owned transition states (restarting/error) take precedence over the probe. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Brings in launcher external gateway detection fix (PR sipeed#1811 against sipeed/picoclaw). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

The --system-prompt flag exposed agent instructions and tool definitions in the process argument list (visible via ps to all users on the host) and risked hitting OS ARG_MAX limits when many tools are registered. System prompt content is now prepended to the stdin payload, separated from the user message by a --- delimiter. This is consistent with how the gemini-cli and codex-cli providers already handle all input. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…nd mixed response handling

… allowlist on self-spawn Previously SubagentManager was initialised with the global provider and the calling agent's model name string (e.g. "openrouter-gpt-5.4"). When the global provider happened to be claude-cli this caused it to be invoked with --model openrouter-gpt-5.4, which claude does not recognise, blowing up every spawn attempt. Two problems fixed together: 1. Provider dispatch: SubagentManager now holds the ProviderDispatcher and the calling agent's model candidates. When a subagent is spawned it resolves the correct provider through the same per-candidate dispatch used by the main agent loop. When agent_id names a different agent, that agent's candidates are resolved via a registry callback so the subagent runs with the target agent's configured model (e.g. spawning "karen" uses karen's claude-cli, not amber's openrouter). 2. Self-spawn allowlist: the allowlist check previously only ran when agent_id was explicitly set. Empty agent_id (self-spawn) now resolves to the caller's own ID before the check, so allow_agents: ["karen"] on amber correctly rejects an unqualified spawn attempt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

When a named subagent (e.g. agent_id="karen") completes a task, its response is now prefixed with the agent's name in bold so the user can tell which agent produced the result. Self-spawns (no agent_id) are unaffected. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

yinwm

Thanks for the work — the core subagent dispatch design (ProviderDispatcher, candidateResolver, self-spawn allowlist fix, attribution) is well thought out.

However, I'm requesting changes for three reasons:

1. PR bundles 10+ unrelated features

This single PR includes gemini-cli provider, multi-Telegram bots, strict_compat, tool call extraction improvements, claude-cli stdin system prompt, launcher gateway detection, README rewrite, dependabot removal, and the ProviderDispatcher — all mixed together. Each should be a separate PR against main so they can be reviewed, tested, and landed independently.

2. Outdated base with guaranteed conflicts

This branch is based on ~March 12 main. PRs #2038 (LightProvider) and #2068 (config/security refactor) have since been merged, causing conflicts in loop.go, instance.go, and loop_test.go. A rebase onto latest main is needed.

3. Architectural overlap with #2098

#2098 takes a simpler approach (direct per-agent provider creation in NewAgentInstance) while this PR introduces ProviderDispatcher as a caching middleware layer. The project needs a single consistent direction — maintainer should decide before both land.

Suggested path forward:

Extract subagent dispatch + ProviderDispatcher as a standalone PR (rebased on latest main)
Submit gemini-cli, multi-Telegram, strict_compat, etc. as separate PRs
Coordinate with #2098 author on unified provider resolution approach

Happy to review each PR individually.

securityguy · 2026-03-28T17:38:12Z

Thank you for the feedback. Each fix was developed on its own branch and submitted as a separate PR, however since I needed the software to be functional, I merged each change into my local main as I went. I apologize — I didn't anticipate that this would cause the subsequent PR branches to carry the accumulated history, effectively combining changes from multiple PRs.

Given the number of changes to your main branch and the volume of outstanding PRs, I'm not in a position to rebase and resubmit at this time. I hope the individual branches and their commit messages are still useful as a reference please feel free to use any of the code you find helpful, and to close any PRs that are no longer relevant.

securityguy · 2026-03-28T17:47:49Z

FYI, I've left all the branches here in case you find them helpful:

https://github.com/securityguy/picoclaw/branches/yours

securityguy and others added 30 commits March 15, 2026 20:53

docs: add pending contributions table to README

ab1c122

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: replace README with minimal dev-fork version

d68d709

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Update README.md

ade27cc

chore: remove dependabot.yml (dev fork, not needed)

be68a49

Update README.md

64b5b37

Merge feat/gemini-cli-provider into main

717bb76

Adds GeminiCliProvider (PR sipeed#1633 against sipeed/picoclaw). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: use Alice and Bob as example agent names throughout

0c856de

Replace Amber/Karen with Alice/Bob in all README examples for consistency. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Merge fix/provider-dispatch: per-candidate provider dispatch

8938eb7

Brings in ProviderDispatcher fix (PR sipeed#1637 against sipeed/picoclaw). References sipeed#1634. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: add PR sipeed#1637 to contributions table

29ec94e

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Update README.md

8ac3006

Merge fix/launcher-detect-external-gateway: detect external gateway

86e68a2

Brings in launcher external gateway detection fix (PR sipeed#1811 against sipeed/picoclaw). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

docs: add PRs sipeed#1810 and sipeed#1811 to contributions table

1010821

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Merge fix/claude-cli-system-prompt-stdin: pass system prompt via stdin

f067024

docs: add PR sipeed#1812 to contributions table

7d5745c

Merge fix/cli-tool-call-extraction: robust CLI tool call extraction a…

837304c

…nd mixed response handling

docs: add PR sipeed#1813 to contributions table

cab1d19

securityguy added a commit to securityguy/picoclaw that referenced this pull request Mar 20, 2026

docs: add PR sipeed#1814 to contributions table

3aafbca

sipeed-bot bot added type: bug Something isn't working domain: tool domain: provider go Pull requests that update go code labels Mar 20, 2026

yinwm requested changes Mar 28, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(subagent): dispatch subagents through per-agent provider; enforce allowlist on self-spawn; attribute responses#1814

fix(subagent): dispatch subagents through per-agent provider; enforce allowlist on self-spawn; attribute responses#1814
securityguy wants to merge 31 commits intosipeed:mainfrom
securityguy:fix/subagent-provider-dispatch

securityguy commented Mar 20, 2026

Uh oh!

yinwm left a comment

Uh oh!

securityguy commented Mar 28, 2026

Uh oh!

securityguy commented Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

securityguy commented Mar 20, 2026

Problems

1. Subagent always used the global provider

2. Named agent delegation was unimplemented scaffolding

3. Allowlist did not cover self-spawns

4. Subagent responses had no attribution

Changes

Uh oh!

yinwm left a comment

Choose a reason for hiding this comment

Uh oh!

securityguy commented Mar 28, 2026

Uh oh!

securityguy commented Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants