ianliuy/agent-fleet

Agent Fleet — VS Code Extension

Prototype sibling-folder fork for testing a tree-first workbench: a compact hierarchy sidebar with role badges, summaries, and an inline inspector.

Before:

🤖 Agent

After:

🤖 Agent.new( 🖥️ Terminal.new( 🤖 Agent ) )

An agent opens a terminal. Inside, another agent wakes up. It opens more terminals. More agents wake up. All the way down.

while (true) {
  agent.createTerminal().launchAgent();  // you are here
}

The missing agent.fork() for Claude Code, Codex CLI, Gemini CLI, and Copilot CLI.

                          You (watching)
                               │
                          ┌────┴────┐
                          │ Agent 0 │  ← your Copilot CLI session
                          └────┬────┘
                    ┌──────────┼──────────┐
               ┌────┴────┐ ┌──┴───┐ ┌────┴────┐
               │ Agent 1 │ │ ...  │ │ Agent N │  ← each in a visible terminal
               └────┬────┘ └──────┘ └────┬────┘
            ┌───────┼───────┐          ┌──┴───┐
       ┌────┴──┐ ┌──┴───┐ ┌┴─────┐   │ ...  │
       │ 1.1   │ │ 1.2  │ │ 1.3  │   └──────┘  ← agents spawning agents
       └───────┘ └──────┘ └──────┘

Each agent can see, type, read, and spawn more of itself. You just watch.

Features

  • 🖥️ Create visible terminals — agent opens new terminal tabs you can see
  • ⌨️ Send commands — agent types commands, you watch in real-time
  • 📖 Read output — agent reads terminal output with cursor-based incremental reads
  • 🔑 Send keystrokes — Ctrl+C, arrow keys, function keys, etc.
  • 📸 Screenshot — capture current terminal screen content
  • 🔄 Multiple sessions — manage any number of terminals simultaneously
  • 🔒 Localhost only — server binds to 127.0.0.1, no remote access
  • 🌲 Hierarchical workbench — sidebar webview renders a tree-first agent hierarchy with window/group/manager/worker/subagent semantics
  • 🧭 Compact hierarchy navigation — collapse, pin, filter, subtree rollups, right-aligned role badges, and inline summaries for large orchestration trees
  • 📋 Inline inspector — inspect the currently selected node's status, children, terminal binding, and subtree health below the tree

Architecture

┌──────────────────────────────────────────────────────┐
│                    VS Code                           │
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │         Agent Fleet Extension                  │  │
│  │                                                │  │
│  │  ┌──────────────┐    ┌──────────────────────┐  │  │
│  │  │  Terminal     │    │  MCP Server (SDK)    │  │  │
│  │  │  Manager      │◄───│  + HTTP transport    │  │  │
│  │  │  (VS Code     │    │  127.0.0.1:17580     │  │  │
│  │  │   API)        │    └──────────┬───────────┘  │  │
│  │  └──────────────┘               │              │  │
│  │                                 │ on activate: │  │
│  │      ┌──────────────────────────┼──────────┐   │  │
│  │      │ 1. Write mcp-config.json            │   │  │
│  │      │ 2. Write ~/.agent-fleet-port        │   │  │
│  │      │ 3. Register VS Code MCP API         │   │  │
│  │      └──────────────────────────┼──────────┘   │  │
│  └─────────────────────────────────┼──────────────┘  │
│                                    │                 │
│  ┌─────────────────────────────────┼──────────────┐  │
│  │  Terminal: copilot              │              │  │
│  │  ┌────────────────┐             │              │  │
│  │  │  Copilot CLI   │─────────────┘              │  │
│  │  │  reads mcp-    │  POST /mcp (JSON-RPC)      │  │
│  │  │  config.json   │                            │  │
│  │  └────────────────┘                            │  │
│  ├────────────────────────────────────────────────┤  │
│  │  Terminal: agent-controlled (visible to you)   │  │
│  │  Terminal: agent-controlled (visible to you)   │  │
│  └────────────────────────────────────────────────┘  │
│                                                      │
│  ┌────────────────────────────────────────────────┐  │
│  │  VS Code Copilot Chat (sidebar / agent mode)  │  │
│  │  Discovers via registerMcpServerDef... API     │  │
│  └────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────┘

Installation

# Build
npm install && npm run build

# Package
npm run package

# Bump patch version, package, and install into regular VS Code
npm run install:local

Or press F5 in VS Code to launch the Extension Development Host for development.

Configuration

The extension auto-registers as an MCP server on activation. No manual setup needed.

It writes to ~/.copilot/mcp-config.json:

{
  "mcpServers": {
    "agent-fleet": {
      "type": "http",
      "url": "http://127.0.0.1:17580/mcp",
      "tools": ["*"]
    }
  }
}

It also writes ~/.agent-fleet-port with the port number for non-MCP discovery.
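
The registration step can be pictured as an upsert into the existing config file. The sketch below is not the extension's actual code: the type names and the upsertAgentFleet helper are hypothetical, and it assumes other entries under mcpServers should be left untouched.

```typescript
// Hypothetical types covering only the fields shown in the config example above.
interface McpServerEntry {
  type: string;
  url: string;
  tools: string[];
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Build the agent-fleet entry for a given port (host and path as in the README).
function buildAgentFleetEntry(port: number): McpServerEntry {
  return { type: "http", url: `http://127.0.0.1:${port}/mcp`, tools: ["*"] };
}

// Insert or overwrite the agent-fleet entry while preserving any other servers.
// Assumption: the real extension may merge differently.
function upsertAgentFleet(existing: Partial<McpConfig> | undefined, port: number): McpConfig {
  const current: Record<string, McpServerEntry> = existing?.mcpServers ?? {};
  return {
    ...existing,
    mcpServers: { ...current, "agent-fleet": buildAgentFleetEntry(port) },
  };
}

const mergedConfig = upsertAgentFleet(
  { mcpServers: { other: { type: "http", url: "http://127.0.0.1:9000/mcp", tools: ["*"] } } },
  17580
);
```

Serializing mergedConfig with JSON.stringify(mergedConfig, null, 2) gives a file body in the same shape as the example above, with the unrelated other entry still present.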

Extension Settings

| Setting | Default | Description |
| --- | --- | --- |
| agentFleet.port | 17580 | Preferred HTTP port (falls back to a random port if busy) |
| agentFleet.maxBufferSize | 1048576 | Max output buffer per terminal (bytes) |
| agentFleet.runnerAutoLaunch | true | Auto-launch and reconnect the local companion runner process on activation |
| agentFleet.logLevel | info | Log level: debug, info, warn, error |

MCP Tools

The extension exposes three tool families:

  • Terminal tools (8): terminal_create, terminal_send, terminal_send_keys, terminal_type, terminal_read, terminal_list, terminal_close, terminal_screenshot
  • Runtime tools (3): runtime_type, runtime_send_keys, runtime_read
  • Graph tools (10): graph_create_group, graph_create_agent, graph_bind_agent, graph_import_terminal, graph_remove, graph_update, graph_move, graph_list, graph_stop_subtree, graph_retry

Terminal tools

| Tool | What it does | Notes |
| --- | --- | --- |
| terminal_create | Create a visible VS Code terminal | Returns both ephemeral terminalId and durable runtimeId metadata |
| terminal_send | Write text directly to stdin | Uses addNewline (default true) |
| terminal_send_keys | Send control/navigation keys | Good for interrupts and shell history |
| terminal_type | Type through the xterm keyboard path | Prefer for busy TUIs that need real typing semantics |
| terminal_read | Read buffered output incrementally | Supports since, waitMs, waitForOutput, waitForIdle, waitForString, raw, maxLines |
| terminal_list | List tracked terminals | Includes runtimeId, ownership, provenance, import policy, mode, and status |
| terminal_close | Close a tracked terminal | Clears the live binding |
| terminal_screenshot | Return a text snapshot of recent output | Uses maxLines |
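
The incremental reads behind terminal_read (the since parameter) can be illustrated with a small cursor-based buffer. This is a simplified sketch, not the code in outputBuffer.ts: the LineBuffer class is hypothetical, and it bounds the buffer by line count, whereas the extension's setting (agentFleet.maxBufferSize) is expressed in bytes.

```typescript
interface ReadResult {
  lines: string[];
  cursor: number;  // pass this back as `since` on the next read
  dropped: number; // lines evicted before the caller could read them
}

// Hypothetical line-based buffer with absolute cursors.
class LineBuffer {
  private readonly maxLines: number;
  private lines: string[] = [];
  private firstIndex = 0; // absolute index of lines[0]

  constructor(maxLines: number) {
    this.maxLines = maxLines;
  }

  appendLine(line: string): void {
    this.lines.push(line);
    const overflow = this.lines.length - this.maxLines;
    if (overflow > 0) {
      // Evict the oldest lines and advance the absolute start index.
      this.lines.splice(0, overflow);
      this.firstIndex += overflow;
    }
  }

  read(since = 0): ReadResult {
    const end = this.firstIndex + this.lines.length;
    // Clamp the cursor into the range that is still buffered.
    const start = Math.min(Math.max(since, this.firstIndex), end);
    return {
      lines: this.lines.slice(start - this.firstIndex),
      cursor: end,
      dropped: Math.max(0, this.firstIndex - since),
    };
  }
}

const outputBuffer = new LineBuffer(3);
outputBuffer.appendLine("a");
outputBuffer.appendLine("b");
const firstRead = outputBuffer.read(0);
["c", "d", "e"].forEach((line) => outputBuffer.appendLine(line));
const secondRead = outputBuffer.read(firstRead.cursor);
```

Because cursors are absolute rather than offsets into the current array, a reader that falls behind can tell how much output it missed (dropped) instead of silently re-reading or skipping lines.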

Graph tools

| Tool | What it does | Notes |
| --- | --- | --- |
| graph_create_group | Create an organizational node | Groups do not bind to terminals |
| graph_create_agent | Create an agent node | runtimeId is the canonical binding key |
| graph_bind_agent | Bind/rebind an existing agent node | terminalId is only an ephemeral fallback |
| graph_import_terminal | Explicitly import a tracked terminal into the graph | Promotes external/manual terminals to workbench/auto metadata before binding |
| graph_remove | Remove a node/subtree | Removes descendants too |
| graph_update | Update label/role/status/summary | Lightweight graph metadata edit |
| graph_move | Reparent a node | Supports sibling index |
| graph_list | Read the hierarchy tree | Returns terminal binding metadata on agent nodes |
| graph_stop_subtree | Stop all bound terminals in a subtree | For graph-managed terminals |
| graph_retry | Recreate a stopped agent runtime, or launch an idle runner-backed agent | Reuses the node's retained runtimeId when present |

Runtime tools

These tools target the extension-owned runner/PTY execution plane rather than VS Code's built-in terminal tabs.

| Tool | What it does | Notes |
| --- | --- | --- |
| runtime_type | Type text into a runner-backed runtime by runtimeId | TUI-safe path for Copilot/Claude-style workers; optional submit presses Enter |
| runtime_send_keys | Send control/navigation keys to a runner-backed runtime | Use for Enter, Escape, Ctrl+C, arrows, etc. |
| runtime_read | Read buffered output from a runner-backed runtime | Supports since, waitFor*, maxLines, and optional raw ANSI output |
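
Sending keys to a terminal ultimately means writing control characters or escape sequences to its input. The sketch below shows the conventional encodings for a few of the keys mentioned above. The key-name syntax ("Ctrl+C", "Up") and the encodeKey helper are assumptions for illustration; the names that the *_send_keys tools actually accept are not documented here.

```typescript
// Conventional terminal encodings (arrow keys in normal cursor mode).
const NAMED_KEYS: Record<string, string> = {
  enter: "\r",
  escape: "\x1b",
  tab: "\t",
  up: "\x1b[A",
  down: "\x1b[B",
  right: "\x1b[C",
  left: "\x1b[D",
};

// Hypothetical helper: turn a key name into the bytes a terminal expects.
function encodeKey(key: string): string {
  const lower = key.toLowerCase();
  if (/^ctrl\+[a-z]$/.test(lower)) {
    // Ctrl+<letter> is the letter's character code with only the low five bits kept.
    return String.fromCharCode(lower.charCodeAt(lower.length - 1) & 0x1f);
  }
  if (!Object.prototype.hasOwnProperty.call(NAMED_KEYS, lower)) {
    throw new Error(`Unknown key: ${key}`);
  }
  return NAMED_KEYS[lower];
}

const interruptByte = encodeKey("Ctrl+C"); // "\x03", the interrupt character
const arrowUpSequence = encodeKey("Up");   // "\x1b[A", recalls shell history
```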

Phase 2 runtime model

  • Tracked terminals and graph nodes are different things. The terminal tracker can know about a terminal even when no graph node exists for it yet.
  • Graph nodes are not terminal nodes. Agent nodes may bind to terminals, but the graph itself models agents/groups, not raw terminals.
  • runtimeId is the durable identity. terminalId is only the live VS Code handle for the current session/window.
  • External terminals are not auto-imported by default. They remain tracked-but-outside-the-graph until explicitly imported with graph_import_terminal.
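
The split between tracked terminals and graph nodes can be sketched as two separate record types, with runtimeId as the preferred lookup key and terminalId as a fallback. The interfaces and the resolveBinding helper below are hypothetical and carry only the fields needed to show the idea.

```typescript
// A terminal the tracker knows about, whether or not a graph node exists for it.
interface TrackedTerminal {
  terminalId: string; // live VS Code handle, valid only for the current session/window
  runtimeId: string;  // durable identity
}

// An agent node in the graph; it may or may not be bound to a terminal.
interface AgentNode {
  nodeId: string;
  runtimeId?: string;
  terminalId?: string;
}

// Resolve a node's live terminal: runtimeId first, terminalId only as a fallback.
function resolveBinding(node: AgentNode, tracked: TrackedTerminal[]): TrackedTerminal | undefined {
  if (node.runtimeId !== undefined) {
    for (const terminal of tracked) {
      if (terminal.runtimeId === node.runtimeId) return terminal;
    }
  }
  if (node.terminalId !== undefined) {
    for (const terminal of tracked) {
      if (terminal.terminalId === node.terminalId) return terminal;
    }
  }
  return undefined;
}

const trackedTerminals: TrackedTerminal[] = [
  { terminalId: "t-7", runtimeId: "rt-a" }, // handle changed after a reload
  { terminalId: "t-8", runtimeId: "rt-b" }, // tracked, but no graph node refers to it
];
const boundTerminal = resolveBinding({ nodeId: "n-1", runtimeId: "rt-a", terminalId: "t-1" }, trackedTerminals);
```

In this example the node's stored terminalId ("t-1") is stale, yet the lookup still succeeds through runtimeId, which is the reason for treating it as the durable identity.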

Requirements

  • VS Code 1.93+ (Shell Integration API)
  • PowerShell or Bash with shell integration enabled
  • Node.js 18+ (bundled with VS Code)

How It Works

  1. On activation, the extension starts an HTTP server on 127.0.0.1:17580 (or on a random port if 17580 is busy)
  2. It registers itself in ~/.copilot/mcp-config.json so Copilot CLI discovers it
  3. The terminal manager adopts existing VS Code terminals into the tracked-terminal registry
  4. Persisted workbench graph/view state is rehydrated
  5. Startup backfill only auto-imports tracked terminals whose metadata says they belong in the graph
  6. Copilot CLI (or any MCP client) sends JSON-RPC requests to /mcp
  7. The extension creates/controls real VS Code terminal tabs via the VS Code API
  8. All terminals are visible — you see exactly what the agent is doing
  9. On deactivation, the extension unregisters and cleans up
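
For step 6, the body of a tool invocation follows the MCP convention of a JSON-RPC 2.0 request with method tools/call. The sketch below only builds such a body. The buildToolCall helper is hypothetical, and terminalId as an argument name is an assumption; since and maxLines are the parameter names listed for terminal_read.

```typescript
// Shape of a JSON-RPC 2.0 request that invokes an MCP tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that assembles the request object.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// "terminalId" as an argument name is an assumption, as is the value "t-1".
const readRequest = buildToolCall(2, "terminal_read", { terminalId: "t-1", since: 0, maxLines: 50 });
const requestBody = JSON.stringify(readRequest);
```

A real client would POST this body to /mcp only after completing the MCP initialize handshake, so in practice an MCP client library is the easier route than hand-built requests.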

Persistence and restore

  • The extension persists graph state + workbench UI state, not live PTY sessions.
  • Persisted agent nodes keep their runtimeId, but persisted terminalId bindings are cleared.
  • Nodes that were previously bound to running terminals come back as disconnected until they are rebound.
  • Startup reconciliation collapses duplicate runtimeId bindings and clears any stale live-terminal claims before the tree rehydrates against the current terminal registry.
  • Existing VS Code terminals are tracked again on startup, but adopted external terminals stay outside the graph unless they match auto-import policy or you explicitly import them.
  • Closing a bound terminal clears the live terminalId while keeping the node's runtimeId, which is what later retry/rebind flows use.
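
The restore rules above can be sketched as a pure function over persisted nodes. This is an illustration under stated assumptions, not the code in graphManager.ts: the status names other than "disconnected" are hypothetical, and the first node to claim a runtimeId is assumed to win when duplicates are collapsed.

```typescript
// Status names other than "disconnected" are assumptions.
type NodeStatus = "running" | "idle" | "stopped" | "disconnected";

interface PersistedAgentNode {
  nodeId: string;
  status: NodeStatus;
  runtimeId?: string;
  terminalId?: string;
}

function sanitizeForRestore(nodes: PersistedAgentNode[]): PersistedAgentNode[] {
  const claimed: string[] = [];
  const restored: PersistedAgentNode[] = [];
  for (const node of nodes) {
    const copy: PersistedAgentNode = { ...node };
    if (copy.terminalId !== undefined) {
      // The live handle never survives a reload; the node waits to be rebound.
      delete copy.terminalId;
      copy.status = "disconnected";
    }
    if (copy.runtimeId !== undefined) {
      if (claimed.indexOf(copy.runtimeId) !== -1) {
        // Assumption: the first claimant keeps the runtimeId, later duplicates lose it.
        delete copy.runtimeId;
      } else {
        claimed.push(copy.runtimeId);
      }
    }
    restored.push(copy);
  }
  return restored;
}

const restoredNodes = sanitizeForRestore([
  { nodeId: "a", status: "running", runtimeId: "rt-1", terminalId: "t-1" },
  { nodeId: "b", status: "idle", runtimeId: "rt-1" },
]);
```

Here node a comes back as disconnected with its runtimeId intact, while node b loses its duplicate claim on rt-1.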

Smoke flow (Phase 2)

Use this flow to verify the current implementation without assuming more than the code does today.

  1. Install / reload
    • Run npm install
    • Run npm run build
    • Press F5 to launch the Extension Development Host, or run npm run install:local to bump the patch version and install the VSIX into regular VS Code
    • In the host window, run Developer: Reload Window once to verify activation + rehydration
  2. Confirm MCP registration
    • Check ~/.copilot/mcp-config.json for agent-fleet
    • Check ~/.agent-fleet-port or the Agent Fleet output channel if the preferred port was busy
  3. Create a workbench-owned runtime
    • Call terminal_create without overriding ownership metadata
    • Confirm terminal_list shows runtimeKind: "vscode-terminal", ownership: "workbench", provenance: "created" (or "recovered" when reusing a runtimeId), and graphImportPolicy: "auto"
  4. Bind by runtimeId
    • Call graph_create_agent or graph_bind_agent with the terminal's runtimeId
    • Confirm the node reflects runtimeId, runtimeKind: "vscode-terminal", and the current terminalId
    • If you later close/retry the terminal, confirm the node keeps the same runtimeId
  5. Verify restart / restore behavior
    • Reload the window
    • Confirm the graph reappears, but previously bound nodes restore as disconnected until rebound
    • Confirm tracked terminal count can differ from graph node count after reload
  6. Verify auto-import policy
    • Workbench-created or recovered terminals with graphImportPolicy: "auto" are eligible for startup backfill into the graph
    • Adopted/external terminals default to manual import and should remain tracked-but-outside-the-graph after reload
  7. Verify manual import / promote path
    • Open or keep an external terminal that the extension tracks
    • Call terminal_list and identify its terminalId
    • Call graph_import_terminal with that terminalId
    • Confirm the terminal is now represented by an agent node and the bound metadata is promoted into the workbench/auto path
  8. Verify focus behavior
    • terminal_send / terminal_send_keys preserve editor focus while sending input
    • terminal_type intentionally focuses the terminal because it uses the active-terminal typing path

Current policy summary

  • Auto-import on startup: only terminals that are already workbench-owned/recovered and marked graphImportPolicy: "auto"
  • Tracked but not in graph: adopted external terminals
  • Explicit import path: terminal_list → graph_import_terminal
  • Canonical rebind concept: runtimeId
  • Current VS Code terminal backend: runtimeKind: "vscode-terminal"
  • Phase 3 runner backend: runtimeKind: "runner-process" is now part of graph binding and the custom runtime panel flow
  • Phase 3 host bridge: the extension auto-discovers / auto-launches the local runner-process companion and keeps a reconnectable host-side bridge ready
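
The import policy can be read as a predicate over terminal metadata plus a promotion step for explicit imports. In the sketch below, the values "external", "adopted", and "manual" are inferred from the prose rather than confirmed field values, and both helper functions are hypothetical.

```typescript
// "workbench", "created", "recovered", and "auto" appear in the README;
// the remaining string values are assumptions.
type Ownership = "workbench" | "external";
type Provenance = "created" | "recovered" | "adopted";
type GraphImportPolicy = "auto" | "manual";

interface RuntimeMetadata {
  runtimeId: string;
  ownership: Ownership;
  provenance: Provenance;
  graphImportPolicy: GraphImportPolicy;
}

// Startup backfill: only workbench-owned terminals marked "auto" enter the graph.
function isEligibleForStartupBackfill(meta: RuntimeMetadata): boolean {
  return meta.ownership === "workbench" && meta.graphImportPolicy === "auto";
}

// Explicit import: promote external/manual metadata to workbench/auto before binding.
function promoteForImport(meta: RuntimeMetadata): RuntimeMetadata {
  return { ...meta, ownership: "workbench", graphImportPolicy: "auto" };
}

const adoptedTerminal: RuntimeMetadata = {
  runtimeId: "rt-ext",
  ownership: "external",
  provenance: "adopted",
  graphImportPolicy: "manual",
};
const promotedTerminal = promoteForImport(adoptedTerminal);
```

Under these assumptions an adopted terminal is skipped by startup backfill until it has been promoted once through an explicit import.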

Smoke flow (Phase 3 MVP)

Use this flow to verify the custom execution-plane slice introduced in Phase 3.

  1. Build the extension + smoke targets
    • npm run build
  2. Run the backend/runtime smoke
    • npm run smoke:phase3
    • This validates:
      • graph_create_agent-style idle runner intent
      • graph_retry launching a real runner-process shell
      • input/output round-trips against the live shell
      • runtime-level TUI-safe typing, keypress, and read flows against a second runner runtime
      • resize through the runner bridge
      • reconnect by rebuilding the host-side service against the same runner process
      • stop + retry while retaining the same runtimeId
  3. Manual panel sanity check in VS Code
    • Launch the Extension Development Host (F5)
    • Create or recover a runner-backed graph node
    • Use the sidebar's focus/open action on that node
    • Confirm the dedicated runtime terminal panel hydrates output, accepts line input, and reflects stop state changes

Development

# Install dependencies
npm install

# Build (one-time)
npm run build

# Watch mode (rebuild on change)
npm run watch

# Launch Extension Development Host
# Press F5 in VS Code

# Package for distribution
npm run package

# Local install into regular VS Code
# Automatically bumps 0.3.0 -> 0.3.1 -> 0.3.2 ...
npm run install:local

Project Structure

src/
├── extension.ts              # Extension entry point, lifecycle management
├── runner/
│   ├── client.ts             # Host-side JSON-RPC client for the companion runner
│   ├── launcher.ts           # Discovery + auto-launch/bootstrap for the local runner process
│   ├── service.ts            # Lightweight lifecycle/cache bridge owned by the extension host
│   ├── main.ts               # Companion runner entry point
│   └── hostBridgeSmoke.ts    # Narrow dev smoke for host↔runner round-trips
├── config/
│   └── autoRegister.ts       # MCP server registration (config file + VS Code API)
├── graph/
│   ├── graphManager.ts       # Persisted graph state + binding sanitization on save
│   ├── mcpTools.ts           # graph_* MCP tool registrations
│   ├── orchestrator.ts       # Agent graph lifecycle, import, bind, retry, startup backfill
│   └── workbenchStateStore.ts # Workspace-state persistence/rehydration
├── server/
│   ├── httpServer.ts         # HTTP server with auto-restart + socket tracking
│   └── mcpServer.ts          # MCP tool definitions (via @modelcontextprotocol/sdk)
├── terminal/
│   ├── manager.ts            # Terminal lifecycle, tracking, adoption, import metadata
│   ├── outputBuffer.ts       # Ring buffer with cursor-based reads + waitFor*
│   ├── shellIntegration.ts   # VS Code Shell Integration API wrapper
│   ├── pseudoTerminal.ts     # PTY mode for advanced terminal control
│   └── runtimeMetadata.ts    # ownership/provenance/graph import policy defaults
├── webview/
│   ├── agentTreeProtocol.ts     # Host/webview message contract
│   ├── agentTreeViewProvider.ts # Thin host bridge: lifecycle, sync, actions, terminal focus
│   └── agentTreeWebviewHtml.ts  # Bundled frontend payload (HTML/CSS/JS)
└── utils/
    ├── logger.ts             # Structured leveled logging to OutputChannel
    └── ansiStrip.ts          # Comprehensive ANSI escape sequence removal

Troubleshooting

Server not starting

Check the output channel: View → Output → Agent Fleet

Copilot CLI can't find the server

Verify the config: cat ~/.copilot/mcp-config.json

Expected:

{
  "mcpServers": {
    "agent-fleet": {
      "type": "http",
      "url": "http://127.0.0.1:17580/mcp",
      "tools": ["*"]
    }
  }
}

Port conflict

If port 17580 is busy, the extension auto-selects a random port. Check ~/.agent-fleet-port for the actual port, or look in the Output channel.
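
A non-MCP client can resolve the server URL from the port file along these lines. The sketch assumes the file holds just the port number as text, which this README does not spell out. The parsePortFile and resolveMcpUrl helpers are hypothetical.

```typescript
const PREFERRED_PORT = 17580;

// Parse the port file's contents; return undefined for anything that is not a valid port.
function parsePortFile(contents: string): number | undefined {
  const trimmed = contents.trim();
  if (!/^\d+$/.test(trimmed)) return undefined;
  const port = Number(trimmed);
  return port >= 1 && port <= 65535 ? port : undefined;
}

// Fall back to the preferred port when the file is missing or unreadable.
function resolveMcpUrl(portFileContents?: string): string {
  const parsed = portFileContents !== undefined ? parsePortFile(portFileContents) : undefined;
  const port = parsed !== undefined ? parsed : PREFERRED_PORT;
  return `http://127.0.0.1:${port}/mcp`;
}

const fallbackUrl = resolveMcpUrl("51234\n"); // "http://127.0.0.1:51234/mcp"
const defaultUrl = resolveMcpUrl();           // "http://127.0.0.1:17580/mcp"
```

In Node, the contents would typically come from reading ~/.agent-fleet-port with fs.readFileSync and os.homedir().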

License

MIT
