feat(cli): add call subcommand for one-shot tool invocation #73

Merged
stevenobiajulu merged 3 commits into main from 72-call-subcommand-20260503
May 3, 2026

@stevenobiajulu
Member

Summary

  • Add email-agent-mcp call <tool> --args '<json>' so MCP tools can be invoked one-shot from a fresh process — no long-lived MCP host required.
  • Sidesteps the "code change → restart Claude Code → MCP picks up new build" cycle. Side benefit: tools become scriptable from cron, launchd, hooks, and shell pipelines.
  • Also exposes --list (enumerate tools) and --schema (print input JSON Schema), and accepts args via --args / --args-file / --args-stdin.

Closes #72.

Architecture

Extracted executeTool() in server.ts as the shared dispatch primitive. handleToolCall() (MCP transport) becomes a thin wrapper that calls executeTool() then formats the result into the MCP content envelope (preserving the download_attachment resource special-case). runCall() in the CLI calls executeTool() then writes raw JSON to stdout.

This keeps serve and call at parity by construction: they share the same action registry and dispatch path. Only the output formatting diverges.
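
The shared-dispatch shape described above can be illustrated with a minimal sketch. Everything here is simplified and hypothetical apart from the names `executeTool` and `handleToolCall`: the registry type, the `echo` stub tool, and `runCallSketch` are assumptions for illustration, and the real signatures in server.ts may differ.

```typescript
// Hypothetical, simplified shapes; the real types in server.ts may differ.
type ToolInput = Record<string, unknown>;
type ToolAction = (input: ToolInput) => Promise<unknown>;

const registry = new Map<string, ToolAction>([
  // Stub action standing in for a real tool such as list_emails.
  ["echo", async (input) => ({ success: true, echoed: input })],
]);

// Shared dispatch primitive: look up the action and return its raw result.
async function executeTool(name: string, input: ToolInput): Promise<unknown> {
  const action = registry.get(name);
  if (!action) throw new Error(`Unknown tool: ${name}`);
  return action(input);
}

// MCP-style wrapper: same dispatch, result wrapped in a content envelope.
async function handleToolCall(name: string, input: ToolInput) {
  const result = await executeTool(name, input);
  return { content: [{ type: "text" as const, text: JSON.stringify(result) }] };
}

// CLI-style wrapper (hypothetical name): same dispatch, raw JSON for stdout.
async function runCallSketch(name: string, input: ToolInput): Promise<string> {
  return JSON.stringify(await executeTool(name, input));
}
```

Because both wrappers go through the same function, a tool added to the registry is reachable from either surface without extra wiring.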

Differences vs. serve:

  • Eager provider init (no demo-mode fallback) — auth failures surface as exit codes instead of masquerading as connecting results
  • Snapshot allowlist (no WatchedAllowlist FS watcher in a one-shot process)

Output

  • stdout = JSON tool result (pretty-printed for TTY, compact for pipe via process.stdout.isTTY)
  • stderr = logs / errors only, so call ... | jq works
  • Exit codes:
    • 0 success
    • 2 CLI / argument / schema / unknown-tool error
    • 3 typed tool failure ({success:false,...}) or runtime throw
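
As a rough sketch of the output contract above: `formatJsonForOutput` is named in this PR, but its signature here is an assumption, and `exitCodeForResult` and the `EXIT` table are hypothetical helpers added only to make the mapping concrete.

```typescript
// Exit codes as listed in the PR description.
const EXIT = { ok: 0, usage: 2, toolFailure: 3 } as const;

// Assumed signature: pretty-print for a terminal, compact for a pipe.
function formatJsonForOutput(value: unknown, isTty: boolean): string {
  return isTty ? JSON.stringify(value, null, 2) : JSON.stringify(value);
}

// Hypothetical helper: a typed failure ({ success: false, ... }) maps to 3,
// anything else maps to 0. CLI/argument errors (2) are decided before dispatch.
function exitCodeForResult(result: unknown): number {
  const failed =
    typeof result === "object" &&
    result !== null &&
    (result as { success?: unknown }).success === false;
  return failed ? EXIT.toolFailure : EXIT.ok;
}
```

A caller would pass `process.stdout.isTTY` as the second argument to `formatJsonForOutput`, which is what lets `call ... | jq` receive compact JSON.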

Out of scope (per peer review with Codex + Gemini)

  • Schema-generated per-tool flags — heterogeneous tool inputs, low ROI vs. the --args JSON path
  • JSON-RPC over stdin as a public surface — confusing UX, requires MCP envelope knowledge; keep as internal debug if useful later
  • Cross-process token-refresh locking — existing single-flight is process-local; mostly fine for read-path call invocations, low risk

Test plan

  • 5 new openspec scenarios under cli/Call Subcommand + cli/Exit Codes
  • Unit: parseCliArgs handles call / --args / --args-file / --args-stdin / --list / --schema
  • Unit: executeTool returns raw action result; handleToolCall wraps it (regression)
  • Unit: getActionInputJsonSchema exposes input schema
  • Unit: runCall exits with 2 on unknown tool, malformed JSON, missing tool name
  • Unit: runCall exits with 3 on typed tool failure (via vi.doMock stub)
  • Unit: formatJsonForOutput pretty-vs-compact based on isTty
  • 177/177 tests pass, 194/194 spec coverage
  • Smoke-tested live against real Microsoft Graph API:
    • call --list returned all 17 tools
    • call list_emails --schema returned valid JSON Schema
    • call list_emails --args '{"limit":2}' returned live Graph results

The MCP server is a long-lived stdio process: source-code changes don't
take effect until the parent harness (Claude Code, Cursor, etc.) restarts.
Add a CLI subcommand that runs the same actions in a fresh process per
invocation, sidestepping the restart cycle and making tools scriptable
from cron, launchd, hooks, and shell pipelines.

Surface:
  email-agent-mcp call <tool> --args '<json>'
  email-agent-mcp call <tool> --args-file <path>
  email-agent-mcp call <tool> --args-stdin
  email-agent-mcp call --list                    # enumerate tools
  email-agent-mcp call <tool> --schema           # print input JSON Schema

Architecture: extract `executeTool()` as the shared dispatch primitive.
`handleToolCall()` (MCP transport) becomes a thin wrapper that calls
`executeTool()` then formats the result into the MCP `content` envelope
(preserving the `download_attachment` resource special-case). `runCall()`
in the CLI calls `executeTool()` then writes raw JSON to stdout. This
keeps `serve` and `call` at parity by construction: they share the action
registry and the dispatch path; only the output formatting diverges.

Differences vs. `serve`:
- Eager provider init (no demo-mode fallback) — auth failures surface as
  exit codes instead of masquerading as `connecting` results
- Snapshot allowlist (no WatchedAllowlist FS watcher in a one-shot process)

Output:
- stdout = JSON tool result (pretty for TTY, compact for pipe via
  `process.stdout.isTTY`)
- stderr = logs / errors only, so `call ... | jq` works
- Exit codes: 0 success, 2 CLI/argument/schema/unknown-tool error,
  3 typed tool failure ({success:false,...}) or runtime throw

Out of scope (per peer review with Codex + Gemini):
- Schema-generated per-tool flags (heterogeneous tool inputs, low ROI vs.
  the --args JSON path)
- JSON-RPC over stdin as a public surface (confusing UX, requires MCP
  envelope knowledge — keep as internal debug if useful later)
- Cross-process token-refresh locking (existing single-flight is
  process-local; mostly fine for read-path `call` invocations, low risk)

Closes #72.

Test coverage:
- 5 new openspec scenarios under cli/Call Subcommand + cli/Exit Codes
- Unit: parseCliArgs handles call/--args/--args-file/--args-stdin/--list/--schema
- Unit: executeTool returns raw action result; handleToolCall wraps it
- Unit: getActionInputJsonSchema exposes input schema
- Unit: runCall exits with 2 on unknown tool, malformed JSON, missing tool name
- Unit: runCall exits with 3 on typed tool failure (vi.doMock stub)
- Unit: formatJsonForOutput pretty-vs-compact based on isTty
- 177/177 tests pass, 194/194 spec coverage
- Smoke-tested live against real Microsoft Graph API

codecov Bot commented May 3, 2026

Codecov Report

❌ Patch coverage is 73.58491% with 28 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| packages/email-mcp/src/cli.ts | 72.00% | 20 Missing and 8 partials ⚠️ |


End-to-end smoke testing surfaced a critical pre-existing bug and Gemini
peer review flagged three smaller cleanups in the new `call` subcommand.

**Critical: bin discarded runCli's exit code (regression now pinned)**

`packages/email-agent-mcp/bin/email-agent-mcp.js` previously called
`runCli()` and only handled thrown errors with `process.exit(1)`. Non-zero
return values (e.g., `2` from invalid args, `2` from unknown command) were
silently discarded — `email-agent-mcp call no_such_tool` returned `0` to
the shell despite printing an error. This affected ALL subcommands; it first
became visible with `call`'s scriptable surface.

Fix: route the bin through the existing `runCliDirect` helper, which sets
`process.exitCode` (not `process.exit()`) so `serve` stays alive for the
MCP stdio handshake while one-shot commands propagate their codes. Export
`runCliDirect` from `@usejunior/email-mcp/index.ts`. Added a regression
test under `cli/Direct entrypoint lifecycle` that pins the propagation in
place — `runCliDirect(['bogus-command'])` must set `process.exitCode = 2`.
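
A minimal sketch of the propagation idea, not the actual `runCliDirect` implementation: `runDirectSketch`, `ExitTarget`, and `CliRunner` are hypothetical names, and the exit target is passed in as a parameter (the real entrypoint would use Node's `process`) so the behavior can be checked without terminating anything.

```typescript
// Minimal stand-in for the part of Node's `process` this sketch touches.
interface ExitTarget {
  exitCode?: number | string | null;
}

// Assumed shape of the CLI runner: resolves to the intended exit code.
type CliRunner = (argv: string[]) => Promise<number>;

async function runDirectSketch(
  argv: string[],
  runCli: CliRunner,
  target: ExitTarget,
): Promise<void> {
  try {
    // Record the code instead of exiting, so a long-lived command such as
    // `serve` is not killed while one-shot commands still report failure.
    target.exitCode = await runCli(argv);
  } catch (err) {
    console.error(err);
    target.exitCode = 1;
  }
}
```

Setting `exitCode` rather than calling `process.exit()` lets the event loop drain naturally, and the recorded code is what the shell sees once the process ends.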

**runCall improvements (per Gemini review)**

1. Skip `WatchedAllowlist` entirely for one-shot use — call the loader
   directly via `loadSendAllowlist(path)`. Avoids spinning up an FS watcher
   we immediately tear down.
2. Drop the redundant `await waitForInit(state)` after `await ensureProvider(state)`
   — `ensureProvider` already awaits init internally.
3. Catch `z.ZodError` explicitly in the executeTool catch block instead of
   regex-matching on error message text. More robust to Zod error format
   changes.
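
To illustrate item 3, here is a dependency-free sketch. `ValidationError` is a hypothetical stand-in for `z.ZodError`, and the regex in `exitCodeByMessage` is invented for illustration; the pattern the PR originally used is not shown in this thread.

```typescript
// Stand-in for z.ZodError so the sketch has no dependencies; with Zod
// installed the check would be `err instanceof z.ZodError`.
class ValidationError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ValidationError";
    // Keeps instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, ValidationError.prototype);
  }
}

// Type-based classification: independent of how the message is worded.
function exitCodeForError(err: unknown): number {
  return err instanceof ValidationError ? 2 : 3;
}

// Brittle alternative (hypothetical pattern): breaks when the message
// format changes, which misroutes schema errors to exit 3.
function exitCodeByMessage(err: unknown): number {
  const message = err instanceof Error ? err.message : String(err);
  return /invalid input/i.test(message) ? 2 : 3;
}
```

With a message that begins with a JSON array, as the later review comment describes for Zod v4, the message-based check falls through to 3 while the type-based check still returns 2.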

Net: 178/178 tests pass, 194/194 spec coverage, lint clean, end-to-end
smoke test against real Microsoft Graph confirms exit codes 0/2/3 now
reach the shell correctly and the new `call` surface integrates with the
PR #71 update_draft fix (a `call create_draft` → `call update_draft` →
`call read_email` flow preserves Graph's auto-quoted thread).
…mailbox_status

Codex peer review caught two correctness bugs in the new `call` subcommand
that the previous round of testing missed.

**Bug 1 (medium): --mailbox silently ignored**

`call` parsed `--mailbox` (the same flag `serve`/`watch` use) but never
forwarded it to the tool. Mailbox-sensitive tools route via
`resolveMailboxContext(state, input.mailbox)`, which reads from the parsed
tool input — not from CLI opts. Result: `email-agent-mcp call delete_email
--mailbox personal --args '{...}'` would silently target the default
mailbox. Real risk for write/delete operations.

Fix: merge `opts.mailbox` into `args.mailbox` when args don't already
specify one. In-args value still wins so explicit input is never
overridden by an ambient flag.
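
The precedence rule can be sketched as below. `mergeMailbox` and `ToolArgs` are hypothetical names, and treating "already specified" as "not `undefined`" is an assumption about the actual check.

```typescript
type ToolArgs = Record<string, unknown>;

// Fill in the mailbox from the CLI flag only when the JSON args did not
// already provide one; an explicit in-args value always wins.
function mergeMailbox(args: ToolArgs, optsMailbox: string | undefined): ToolArgs {
  if (optsMailbox === undefined || args["mailbox"] !== undefined) return args;
  return { ...args, mailbox: optsMailbox };
}
```

For example, `mergeMailbox({ id: "1" }, "personal")` adds `mailbox: "personal"`, while `mergeMailbox({ mailbox: "work" }, "personal")` leaves `"work"` untouched.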

**Bug 2 (medium): eager init blocked the diagnostic tool**

`get_mailbox_status` is intentionally non-blocking — it inspects
`state.status` (pending/connecting/not_configured/error) and reports the
state as its result, by design. The eager-init gate I added in `runCall`
short-circuited it with exit 3 exactly when it would be most useful (e.g.,
checking why the provider is broken).

Fix: skip ensureProvider when the tool name is `get_mailbox_status`. This
is the only known non-blocking tool; if more are added, this can grow into
an annotation lookup.
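
One way the gate could look, as a sketch only: `NON_BLOCKING_TOOLS` and `needsProviderInit` are hypothetical names, and a set lookup is just one possible precursor to the annotation lookup mentioned above.

```typescript
// Tools that report provider state themselves and must not wait on init.
const NON_BLOCKING_TOOLS: ReadonlySet<string> = new Set(["get_mailbox_status"]);

// Decide whether eager provider init should run before dispatching a tool.
function needsProviderInit(toolName: string): boolean {
  return !NON_BLOCKING_TOOLS.has(toolName);
}
```

A caller would run `ensureProvider` only when `needsProviderInit(toolName)` returns true, so the diagnostic tool still works when the provider is broken.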

Tests: 3 new scenarios under cli/Call Subcommand Mailbox Routing and
cli/Call Subcommand Diagnostic Tools. 181/181 tests pass, 196/196 spec
coverage. Verified live against real Microsoft Graph:
  - `call get_mailbox_status` returns the diagnostic state without exit 3
  - `call list_emails --mailbox bogus-name --args '{...}'` exits 3 with
    "Mailbox not configured" (instead of silently using the default)
@stevenobiajulu
Member Author

Pushed 553f123 and 09bd482 after a peer-review pass (Codex + Gemini) and end-to-end smoke testing surfaced bugs.

Critical (smoke-test-found): bin discarded exit codes (regression now pinned)
The bin script packages/email-agent-mcp/bin/email-agent-mcp.js called runCli() and only handled thrown errors with process.exit(1) — non-zero return values were silently dropped. email-agent-mcp call no_such_tool returned 0 to the shell despite printing an error. Affected ALL subcommands; call is just where it first matters for piped/scripted use. Fix: route the bin through runCliDirect (now exported), which sets process.exitCode. Regression test added under cli/Direct entrypoint lifecycle.

High (Codex-found): schema validation mapped to exit 3 instead of 2
The catch in runCall was regex-matching error message text — Zod v4's actual message format starts with [\n {\n "expected"... which my regex didn't match, so action.input.parse(...) failures fell through to runtime-failure (exit 3) instead of CLI-error (exit 2). Fix: catch err instanceof z.ZodError explicitly.

Medium (Codex-found): --mailbox silently ignored
call parsed --mailbox but never forwarded it to the tool. Mailbox-sensitive tools route from args.mailbox, not from CLI opts, so call delete_email --mailbox personal --args '{...}' silently targeted the default mailbox. Real correctness risk. Fix: merge opts.mailbox into args.mailbox when not already present (in-args value wins).

Medium (Codex-found): eager init blocked get_mailbox_status
get_mailbox_status is intentionally non-blocking — the diagnostic tool that reports pending/not_configured/error state. My eager-init gate short-circuited it with exit 3 exactly when it would be most useful. Fix: skip ensureProvider when tool name is get_mailbox_status.

Cleanups (Gemini-found):

  • Skip WatchedAllowlist for one-shot use — call loadSendAllowlist(path) directly
  • Drop redundant await waitForInit(state) after await ensureProvider(state) (ensureProvider awaits init internally)

Verification:

  • 181/181 tests pass, 196/196 spec coverage, lint clean
  • End-to-end smoke against real Microsoft Graph: 14 distinct scenarios (--list, --schema, --args, --args-file, --args-stdin, jq integration, all four exit codes, regression for non-call commands, full create_draft → update_draft → read_email flow that integrates the quoted-thread fix from PR #71, fix(graph): preserve auto-quoted thread history in update_draft)

@stevenobiajulu stevenobiajulu merged commit edc999b into main May 3, 2026
13 of 14 checks passed
Development

Successfully merging this pull request may close these issues.

Add call subcommand for one-shot CLI invocation of MCP tools