@lossyrob lossyrob commented Dec 3, 2025

[Workflow Handoffs] Intelligent Stage Navigation and Workflow Resumption

Summary

This PR implements the Workflow Handoffs feature, transforming PAW from a manual stage-navigation system into an intelligent, guided workflow with three automation levels. Previously, users had to manually navigate between stages—completing a specification, then navigating to the correct prompt file in .paw/work/<slug>/prompts/, and manually executing it. This created friction that hindered adoption, especially when returning to a workflow after days or weeks away.

With Workflow Handoffs, when a PAW agent completes its work, it presents clear, actionable next steps. Users can type simple commands like research or implement Phase 2 to instantly transition to the appropriate agent in a fresh chat session, with all context automatically carried forward through the Work ID.

Key capabilities introduced:

  1. Contextual Stage Transitions — Agents present formatted next-step options with commands like research, code, implement Phase 2, status. Users type simple commands to transition to appropriate agents.

  2. Three Handoff Modes — WorkflowContext.md includes a "Handoff Mode" field (manual/semi-auto/auto):

    • Manual: Full control - you command each stage transition
    • Semi-Auto: Thoughtful automation - automatic at research/review, pause at decisions
    • Auto: Full automation - agents chain through all stages (local strategy required)
  3. Enhanced Status Agent — Users can ask "where am I?" and receive comprehensive analysis including completed artifacts, current phase progress, git divergence, PR status, and actionable next steps.

  4. Dynamic Prompt Generation — Prompts are generated on-demand via paw_generate_prompt tool only when customization is needed, reducing filesystem noise.

  5. Inline Customization — Users provide inline instructions during handoffs (continue Phase 2 but add rate limiting) without creating prompt files.

  6. PAW: Get Work Status Command — Quick access to workflow status via Command Palette, with work item picker sorted by recency.

Related Issues

Artifacts

Implementation Phases

| Phase | Description | PR |
|-------|-------------|----|
| Phase 1 | Handoff Tool Foundation | #115 |
| Phase 2 | Status Agent Enhancements | #116 |
| Phase 3 | Dynamic Prompt Generation | #117 |
| Phase 4 | Agent Instruction Updates | #119 |
| Phase 5 | Testing and Validation | #120 |
| Phase 6 | On-Demand Prompt Generation | #122 |
| Phase 7 | Status Command and Agent Rename | #123 |

Planning PR: #114

Documentation Updates

  • Documentation PR: #124
  • README.md updated with Workflow Handoffs section, quickstart guide, and command documentation
  • Comprehensive Docs.md covering architecture, user guide, technical reference, and testing guide

Changes Summary

Key Changes

  • New Tool: paw_call_agent — Enables agent-to-agent transitions with Work ID context passing
  • New Tool: paw_generate_prompt — Creates customizable prompt files on demand
  • New Command: PAW: Get Work Status — Quick access to workflow status with work item picker
  • Agent Instruction Updates — All 14 PAW agents updated with {{HANDOFF_INSTRUCTIONS}} component
  • Status Agent Renamed: PAW-X Status Update → PAW-X Status, for brevity
  • Handoff Mode Support — WorkflowContext.md now includes Handoff Mode field
  • On-Demand Prompts — Prompts no longer auto-generated at initialization

Files Created

  • src/tools/handoffTool.ts — Handoff tool implementation
  • src/tools/promptGenerationTool.ts — Prompt generation tool
  • src/commands/getWorkStatus.ts — Work status command
  • src/test/suite/handoffTool.test.ts — Handoff tool tests
  • src/test/suite/promptGenerationTool.test.ts — Prompt generation tests
  • agents/components/handoff-instructions.component.md — Shared handoff component
  • agents/components/review-handoff-instructions.component.md — Review workflow handoff component

Files Modified

  • All 14 agent files (agents/*.agent.md) — Added handoff instructions
  • src/extension.ts — Registered new tools and commands
  • package.json — Added tool definitions and command
  • src/ui/userInput.ts — Added Handoff Mode collection
  • src/prompts/workflowInitPrompt.ts — Added Handoff Mode field
  • README.md — Added comprehensive documentation

Testing

Automated ✅

  • All 107 unit tests pass: npm test
  • TypeScript compilation succeeds: npm run compile
  • Agent linter passes for all agents (with documented exceptions)

Manual Verification

  • Manual mode: User types research → Spec Research Agent opens with Work ID
  • Status Agent responds to "where am I?" with workflow state
  • PAW: Get Work Status command works from Command Palette
  • Inline instructions passed to target agents

Acceptance Criteria

From Spec.md:

  • ✅ SC-001: Manual mode stage transitions work within 2 seconds
  • ✅ SC-002: Semi-Auto mode auto-chains at research/review transitions
  • ✅ SC-003: Auto mode chains through all stages with tool approvals only
  • ✅ SC-004: Inline instructions pass to agents without prompt file creation
  • ✅ SC-005: Status Agent reports workflow state with git divergence
  • ✅ SC-006: Dynamic prompt generation creates correctly named files
  • ✅ SC-007: Missing prerequisites produce actionable error messages
  • ✅ SC-008: Directory scanning completes within 2 seconds for 10 workflows
  • ✅ SC-009: Help mode explains stage purposes
  • ✅ SC-010: Missing Handoff Mode defaults to manual with graceful fallback

Deployment Considerations

  • Backward Compatible: Existing workflows without Handoff Mode default to manual
  • No Migration Required: Workflows gain handoff capability on next agent invocation
  • Extension Activation: New tools registered alongside existing tools

Breaking Changes

None. All changes are additive and backward compatible.


🐾 Generated with PAW

- Update readiness checklist to identify Code Research Agent as next stage
- Clarify that Code Research comes before Implementation Plan in Stage 02
- Aligns with PAW specification workflow sequence
- Add principle: research questions document existing behavior only
- Add Research Question Guidelines section with decision-making workflow
- Clarify that design decisions belong in spec, not research
- Update examples to use generic software development scenarios
- Condense guidance to reduce token usage
Key changes based on review feedback:

1. Handoff Tool Simplification:
   - Use exact agent names as enum parameter (agents map user requests)
   - Remove procedural validation logic (agents validate prerequisites)
   - Remove inline instruction parsing (agents interpret user input)
   - Return empty string on success (new chat interrupts conversation)
   - Verify VS Code API parameter name for agent mode

2. Status Agent Redesign:
   - Remove status tool implementation
   - Update agent instructions to use existing tools for status detection
   - Agent determines status via file reads, git commands, GitHub MCP
   - Default behavior: help navigation (issue posting on request only)
   - Add help mode and multi-work-item support

3. Prompt Generation Simplification:
   - Use existing New PAW Workflow logic
   - Parameters: agent_name, optional additional_content
   - Remove phase-specific extraction (agents determine context)
   - Let agents intelligently compose additional_content

4. Agent Instructions Enhancement:
   - Add prompt generation option to handoff instructions
   - Users can generate customizable prompts instead of direct handoff
   - Agents map user requests to exact agent names
   - Agents validate prerequisites before handoffs

Aligns with PAW architecture: tools provide procedural operations,
agents provide decision-making logic and reasoning.
Make Status Agent instructions tool-agnostic by describing what to do
rather than which tools to use. Agent will determine appropriate tools
based on available capabilities.
[Workflow Handoffs] Planning: Intelligent stage navigation and workflow resumption
- Create paw_call_agent Language Model Tool for agent-to-agent transitions
  - Implemented HandoffParams interface with target_agent, work_id, and inline_instruction
  - Added Work ID validation matching /^[a-z0-9-]+$/ pattern
  - Invokes new chat using VS Code Language Model API (fire-and-forget)
  - Tool approval UI with clear message about target agent and work ID

- Add Handoff Mode field to WorkflowContext.md
  - New HandoffMode type: 'manual' | 'semi-auto' | 'auto'
  - Updated workflow initialization to collect handoff mode from user
  - Added validation: Auto mode requires local review strategy
  - Template includes Handoff Mode field in WorkflowContext.md

- Register tool in extension and package.json
  - Imported and registered registerHandoffTool in extension.ts
  - Added paw_call_agent tool definition with JSON schema in package.json
  - Tool description guides agents to intelligently map user requests to agent names

- Update README with Workflow Handoffs documentation
  - Added section explaining Manual/Semi-Auto/Auto modes
  - Documented stage transition commands (research, implement Phase 2, etc.)
  - Explained inline instructions syntax
  - Added note about dynamic prompt generation

- All tests pass (npm test)
- TypeScript compilation succeeds (npm run compile)
- No linting errors in new code
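
The validation rules listed in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in handoffTool.ts: only HandoffParams, its three field names, the HandoffMode values, and the Work ID pattern come from the commit message. The field types, the ReviewStrategy type, the error wording, and the helpers validateWorkId, validateHandoffMode, and buildHandoffQuery are assumptions made for the example.

```typescript
// Sketch only: everything not quoted in the commit message is hypothetical.

type HandoffMode = "manual" | "semi-auto" | "auto";
type ReviewStrategy = "prs" | "local"; // assumed representation of the review strategy

interface HandoffParams {
  target_agent: string;        // exact agent name, chosen by the calling agent
  work_id: string;             // must match WORK_ID_PATTERN
  inline_instruction?: string; // optional free-form text for the target agent
}

// Pattern quoted in the commit message: lowercase letters, digits, hyphens.
const WORK_ID_PATTERN = /^[a-z0-9-]+$/;

// Returns an error message, or undefined when the Work ID is acceptable.
function validateWorkId(workId: string): string | undefined {
  if (!WORK_ID_PATTERN.test(workId)) {
    return `Invalid Work ID "${workId}": use lowercase letters, digits, and hyphens only`;
  }
  return undefined;
}

// Auto mode is only allowed together with the local review strategy.
function validateHandoffMode(mode: HandoffMode, strategy: ReviewStrategy): string | undefined {
  if (mode === "auto" && strategy !== "local") {
    return "Auto handoff mode requires the local review strategy";
  }
  return undefined;
}

// Hypothetical helper: builds the text that would seed the new chat.
// target_agent is not used here; the real tool would pass it to the chat API.
function buildHandoffQuery(params: HandoffParams): string {
  const problem = validateWorkId(params.work_id);
  if (problem !== undefined) {
    throw new Error(problem);
  }
  const lines = [`Work ID: ${params.work_id}`];
  if (params.inline_instruction) {
    lines.push(`Additional instruction: ${params.inline_instruction}`);
  }
  return lines.join("\n");
}
```

Returning a message instead of throwing from the validators keeps them usable both in the tool and in the initialization UI, which is a design choice of this sketch rather than something the commit states.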
The getWorkspaceFolderPaths() function was accessing vscode.workspace.workspaceFolders
before checking the PAW_WORKSPACE_PATH environment variable. In test environments,
this caused timeouts because the VS Code workspace API may not be fully initialized.

This change checks for the environment variable override first, only accessing the
VS Code API if no override is provided. This allows tests to run efficiently while
maintaining production behavior.

Fixes the 'getContext loads workspace, user, and workflow content when present' test
timeout that was observed in the test suite (1 of 79 tests failing).
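
A minimal sketch of the reordered lookup, assuming the function returns a list of folder paths and that the override holds a single path. To keep the example runnable outside VS Code, the environment and the workspace API are passed in as parameters; the real getWorkspaceFolderPaths() reads process.env and vscode.workspace.workspaceFolders directly, and its exact signature is not shown in this commit.

```typescript
// Sketch only: the parameters are stand-ins so the example runs without VS Code.
function getWorkspaceFolderPaths(
  env: { [key: string]: string | undefined },
  readWorkspaceFolders: () => string[]
): string[] {
  // Check the override first so tests never touch the workspace API.
  const override = env["PAW_WORKSPACE_PATH"];
  if (override !== undefined && override.trim() !== "") {
    return [override.trim()];
  }
  // No override: fall back to the (possibly slow to initialize) workspace API.
  return readWorkspaceFolders();
}
```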
[Workflow Handoffs] Phase 1: Handoff Tool Foundation
… Agent

Added extensive PAW process knowledge to Status Agent to enable expert user guidance:

- New 'PAW Process Guide' section with 9 workflow stages (Specification through Final PR)
  including inputs, outputs, duration, commands, and mode-specific inclusion rules
- Workflow mode behavior comparison (full/minimal/custom) with stage lists and use cases
- Review strategy comparison (prs vs local) with branching patterns and tradeoffs
- Handoff automation levels (manual/semi-auto/auto) with auto-chain and pause points
- Artifact dependency detection logic with state→recommendation mappings
- Common user scenarios (new user, resuming, mid-workflow, multi-work management)
- Common errors with causes and resolution steps
- Natural language command mapping to agent names

Updated linter to support Status Agent's higher context needs:
- STATUS_AGENT_WARN_THRESHOLD=5000 (vs 3500 for other agents)
- STATUS_AGENT_ERROR_THRESHOLD=8000 (vs 6500 for other agents)
- Status Agent now 4864 tokens (within special threshold)
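
The per-agent thresholds can be illustrated with a small classifier. This is a TypeScript sketch for illustration only; the actual linter may be written differently. The four numbers come from the list above, while the DEFAULT_* constant names, the LintLevel type, classifyTokenCount, and the inclusive comparison are assumptions.

```typescript
// Sketch only: threshold values are from the commit message, the rest is hypothetical.
const DEFAULT_WARN_THRESHOLD = 3500;
const DEFAULT_ERROR_THRESHOLD = 6500;
const STATUS_AGENT_WARN_THRESHOLD = 5000;
const STATUS_AGENT_ERROR_THRESHOLD = 8000;

type LintLevel = "ok" | "warn" | "error";

// Classifies an agent file by token count, using the relaxed limits for the Status Agent.
function classifyTokenCount(tokens: number, isStatusAgent: boolean): LintLevel {
  const warnAt = isStatusAgent ? STATUS_AGENT_WARN_THRESHOLD : DEFAULT_WARN_THRESHOLD;
  const errorAt = isStatusAgent ? STATUS_AGENT_ERROR_THRESHOLD : DEFAULT_ERROR_THRESHOLD;
  if (tokens >= errorAt) {
    return "error";
  }
  if (tokens >= warnAt) {
    return "warn";
  }
  return "ok";
}
```

Under these assumptions, 4864 tokens is "ok" for the Status Agent but would be "warn" for any other agent.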

All automated verification passes:
✅ Agent linter passes
✅ 79 unit tests pass
✅ TypeScript compilation succeeds

Addresses: #116 (comment)
Added missing critical information from PAW specification:

1. Duration estimates for all stages (15-30 min for Spec, 20-40 min for Code Research, etc.)
2. Two-agent implementation workflow pattern explanation:
   - 03A (Implementation Agent): Makes changes, commits locally, never pushes
   - 03B (Implementation Review Agent): Reviews, documents, pushes, opens PRs
   - Clear separation of forward momentum vs quality gate responsibilities
3. PR review response workflow (03C/03D):
   - Stage 7 now documents PR Review Response process
   - 03A addresses comments with local commits → 03B verifies, pushes, posts summary
   - Workflow diagrams for initial development and review comment response
4. Planning PR clarification:
   - Stage 4 now explicitly lists Planning PR output
   - Planning branch (`<target>_plan`) noted in branching details
5. Enhanced navigation commands with PR review workflow steps

Status Agent grew from 4864 to 5549 tokens (within 8000 error threshold).

Verification:
✅ Agent linter passes (warns at 5549 tokens, under 8000 error threshold)
✅ TypeScript compilation succeeds
Workspace-level custom instruction for Implementation Review Agent ensures
paw-specification.md and Status Agent stay in sync.

Enforces bidirectional checks:
- Spec changes → verify Status Agent updated
- Status Agent changes → verify alignment with spec

Source of truth: the specification. The Status Agent reflects the spec for users.

Demonstrates PAW custom instructions capability for team-wide guidance.
Condensed Implementation Review custom instruction from detailed procedural steps to concise guidance:
- Removed verbose 'when reviewing X, verify Y' checklist format
- Consolidated bidirectional sync rules into bullet points
- Emphasized source of truth principle: spec → Status Agent
- Reduced from 48 lines to 17 lines while preserving essential requirements

Rationale: Custom instructions should provide high-level guidance, not duplicate agent instructions or create procedural checklists. The agent already has the intelligence to verify synchronization; the instruction just needs to tell it what to check for.
Enhanced PAW context component to emphasize custom instructions are CRITICAL:
- Added 'CRITICAL FIRST STEP' heading to signal importance
- Explained WHY skipping this matters (project policies, verification steps)
- Clarified precedence hierarchy explicitly (workspace > user > default)
- Framed as mandatory quality gate, not optional retrieval
- Added conflict resolution guidance

This ensures all agents understand custom instructions override defaults and must be checked before performing work.
…inimal mode

Corrected Status Agent to match paw-specification.md:
- Minimal mode INCLUDES Documentation stage (not skipped)
- Updated stage list: Code Research → Plan → Implementation → Docs → Final PR
- Artifacts now include Docs.md for minimal mode
- Clarified Spec/Spec Research are skipped (requirements assumed clear)
- Changed stage 8 from 'Skipped in: Minimal mode' to 'Required: All modes'

This resolves the inconsistency where:
- Spec Workflow Modes section said minimal includes Documentation
- Status Agent incorrectly said it was skipped

Spec is source of truth: Documentation is required for all workflow modes.

Token count: 5687 (within acceptable limits, +138 tokens from clarifications)
[Workflow Handoffs] Phase 2: Status Agent Enhancements
Prevents test runs from overwriting agents in the user's real prompts
directory. This is important when working on multiple PAW branches
simultaneously, as running tests on one branch would overwrite agents
installed from another branch.

The fix checks context.extensionMode === vscode.ExtensionMode.Test
and returns early from installAgentsIfNeeded() when in test mode.
The installer unit tests still work as they mock the extension context
and use temporary directories.
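
A minimal sketch of the guard, written so it runs without the vscode module. The local ExtensionMode enum mirrors the numeric values of vscode.ExtensionMode, and ExtensionContextLike stands in for vscode.ExtensionContext. The install callback and the boolean return value are assumptions for the example; the real installAgentsIfNeeded() signature is not shown in this commit.

```typescript
// Local mirror of vscode.ExtensionMode so the sketch has no VS Code dependency.
enum ExtensionMode {
  Production = 1,
  Development = 2,
  Test = 3,
}

// Stand-in for the part of vscode.ExtensionContext that the guard reads.
interface ExtensionContextLike {
  extensionMode: ExtensionMode;
}

// Returns true when the install step ran, false when it was skipped.
function installAgentsIfNeeded(context: ExtensionContextLike, install: () => void): boolean {
  if (context.extensionMode === ExtensionMode.Test) {
    // Test runs must not overwrite agents in the user's real prompts directory.
    return false;
  }
  install();
  return true;
}
```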
- Consolidate Core Specification Principles and Guardrails sections
  to eliminate redundant content (11 principles + 16 guardrails → 9 principles + 11 guardrails)
- Remove WorkflowContext.md creation logic (now handled by VS Code extension)
- Remove Work ID Processing section (normalization, validation, uniqueness handled by extension)
- Remove Work Title Refinement section
- Simplify Start section to reference paw_get_context tool instead of manual file reading
- Update WorkflowContext.md section to note it's created by extension, not agent

Token reduction: 8008 → 6357 tokens (saved ~1650 tokens)
…onent

- Add WorkflowContext.md fields table to paw-context.component.md
  explaining all fields agents receive from paw_get_context tool
- Remove redundant 'WorkflowContext.md Parameters' sections from 9 agents:
  PAW-01A, PAW-01B, PAW-02A, PAW-02B, PAW-03A, PAW-03B, PAW-04, PAW-05, PAW-X
- These sections duplicated field format and Work ID generation logic
  now handled by VS Code extension 'PAW: New PAW Workflow' command
- Simplify Start sections to reference paw_get_context tool call
- Condense Workflow Mode handling in PAW-01A Specification agent

Token reductions:
- PAW-01A: 6627 → 6358 (-269)
- PAW-01B: 3523 → 3112 (-411)
- PAW-02A: 6057 → 5645 (-412)
- PAW-02B: 7557 → 7188 (-369)
- PAW-03A: 6227 → 5846 (-381)
- PAW-03B: 6850 → 6467 (-383)
- PAW-04: 7025 → 6643 (-382)
- PAW-05: 4442 → 4059 (-383)
- PAW-X: 7372 → 7159 (-213)
Adds instruction to stage and commit WorkflowContext.md to the target
branch after it is created and checked out. This ensures the workflow
context is tracked in version control from the start of the work item.
…uccessful completion only and outline handling of blockers
Addresses the problem where agents in auto mode consistently fail to
auto-proceed because they're confused by competing instructions for
multiple modes in their system prompt.

Changes:
- Add parseHandoffMode() to extract Handoff Mode from WorkflowContext.md
- Add getHandoffInstructions() returning mode-specific behavior for
  manual, semi-auto, and auto modes
- Update formatContextResponse() to include <handoff_instructions>
  section at END of response for recency bias
- Simplify handoff-instructions.component.md by removing mode-specific
  behavior sections (now delivered via tool result)
- Add CRITICAL notice directing agents to reference the tool result
- Update paw-context.component.md to document handoff instructions
- Add comprehensive unit tests for parsing and instruction generation

Design rationale:
- Agents receive pre-parsed, mode-specific instructions reducing cognitive load
- Handoff instructions positioned at end for LLM recency bias
- Safe default to manual mode if parsing fails
- Command mapping and tool patterns remain in component for reference
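
The parsing and per-mode lookup described above might look like the sketch below. Only the function names parseHandoffMode and getHandoffInstructions, the three mode values, and the fallback to manual come from the commit message. The assumed field format (a line reading "Handoff Mode: <value>") and the instruction wording are placeholders, not the text the extension ships.

```typescript
type HandoffMode = "manual" | "semi-auto" | "auto";

// Assumption: the field appears on its own line as "Handoff Mode: <value>".
function parseHandoffMode(workflowContext: string): HandoffMode {
  const match = /^\s*Handoff Mode:\s*([A-Za-z-]+)\s*$/m.exec(workflowContext);
  const value = match ? (match[1] || "").toLowerCase() : "";
  switch (value) {
    case "semi-auto":
      return "semi-auto";
    case "auto":
      return "auto";
    default:
      // Missing or unrecognized values fall back to the safe default.
      return "manual";
  }
}

// Placeholder wording: returns only the instructions for the active mode,
// so the agent never sees competing rules for the other two modes.
function getHandoffInstructions(mode: HandoffMode): string {
  switch (mode) {
    case "auto":
      return "Handoff mode: auto. Chain to the next stage without waiting, unless blocked.";
    case "semi-auto":
      return "Handoff mode: semi-auto. Proceed automatically at research and review transitions; pause at decision points.";
    default:
      return "Handoff mode: manual. Present next steps and wait for the user's command.";
  }
}
```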
…nges and propose tool-based instruction delivery
Addresses PR review comments from #126:

1. Comment on getHandoffInstructions: Move handoff instructions from embedded
   strings in contextTool.ts to template files in src/prompts/ directory.
   Created:
   - handoffManual.template.md
   - handoffSemiAuto.template.md
   - handoffAuto.template.md
   Similar pattern to workItemInitPrompt.template.md.

2. Comment on auto mode CRITICAL section: Added Failure Mode Exception to all
   three mode templates. Agents should NOT invoke handoff when blocked (merge
   conflicts, missing prerequisites, errors requiring user input). References
   similar text in handoff-instructions.component.md about presenting blockers
   and not adding Next Steps until resolved.

3. Comment on formatContextResponse empty check: Refactored the brittle check
   `sections.length === 1 && sections[0].includes('<handoff_instructions>')`
   to return early from the function when no actual context sections have
   content. Now checks hasWorkspaceContent, hasUserContent, hasWorkflowContent
   before building the response, and returns '<context status="empty" />'
   immediately if none have content.
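
The early-return refactor from item 3 could be sketched like this. The flag names hasWorkspaceContent, hasUserContent, and hasWorkflowContent, the `<handoff_instructions>` tag, and the empty-context marker are taken from the text above. The ContextParts interface and the other tag names are assumptions; the real formatContextResponse() in contextTool.ts may take different arguments.

```typescript
// Hypothetical input shape for the sketch.
interface ContextParts {
  workspaceInstructions: string;
  userInstructions: string;
  workflowContext: string;
  handoffInstructions: string;
}

function formatContextResponse(parts: ContextParts): string {
  const hasWorkspaceContent = parts.workspaceInstructions.trim().length > 0;
  const hasUserContent = parts.userInstructions.trim().length > 0;
  const hasWorkflowContent = parts.workflowContext.trim().length > 0;

  // Return early instead of inspecting the built sections afterwards.
  if (!hasWorkspaceContent && !hasUserContent && !hasWorkflowContent) {
    return '<context status="empty" />';
  }

  const sections: string[] = [];
  if (hasWorkspaceContent) {
    sections.push(`<workspace_instructions>\n${parts.workspaceInstructions}\n</workspace_instructions>`);
  }
  if (hasUserContent) {
    sections.push(`<user_instructions>\n${parts.userInstructions}\n</user_instructions>`);
  }
  if (hasWorkflowContent) {
    sections.push(`<workflow_context>\n${parts.workflowContext}\n</workflow_context>`);
  }
  // Handoff instructions go last so they are the most recent text the agent reads.
  sections.push(`<handoff_instructions>\n${parts.handoffInstructions}\n</handoff_instructions>`);
  return sections.join("\n\n");
}
```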

Links:
- #126
Update the Final PR Handoff section to properly handle PR review comments
instead of marking it as a terminal stage with no further handoffs.

Changes:
- Add 'After Final PR opened - Handoff Message Rules' section
- Present two next steps: 'address comments' and merge options
- Include example handoff messages for both prs and local strategies
- Add 'Addressing Review Comments Flow' section showing the workflow:
  - address comments -> PAW-03A Implementer -> PAW-03B Impl Reviewer -> back to PR agent
- Move 'Terminal stage' note to clarify workflow ends when PR is merged, not when opened

This aligns with the Implementation Review Agent's handoff pattern and supports
the 'address comments' command added in the previous commit.
…alls

After merging main, the constructAgentPrompt function signature now requires
6 arguments (targetBranch, workflowMode, reviewStrategy, handoffMode, issueUrl,
workspacePath). Updated all 13 test calls in customInstructions.test.ts that
were missing the handoffMode parameter.

All 143 tests pass.
[Workflow Handoffs] Phase 8: Mode-specific handoff instructions via paw_get_context
@lossyrob lossyrob merged commit 592fcb6 into main Dec 4, 2025
1 check passed
@lossyrob lossyrob deleted the feature/workflow-handoffs branch December 4, 2025 20:23
@lossyrob lossyrob added the enhancement New feature or request label Dec 4, 2025

Development

Successfully merging this pull request may close these issues.

  • Implement Contextual Stage Navigation for Workflow Handoffs
  • Add PAW Workflow Status Capability
