Commit 592fcb6

Merge pull request #125 from lossyrob/feature/workflow-handoffs
[Workflow Handoffs] Add intelligent stage navigation and workflow resumption
2 parents e386be5 + 7d8830f commit 592fcb6

70 files changed, +7076 -1957 lines changed

.github/copilot-instructions.md

Lines changed: 2 additions & 0 deletions
@@ -17,3 +17,5 @@ All pull requests to `main` must be labeled with one of the following labels:
 - `bug` - For bug fixes
 - `documentation` - For documentation changes
 - `maintenance` - For maintenance, refactoring, or chores
+
+IMPORTANT: **PAW Architecture Philosophy** - tools provide procedural operations, agents provide decision-making logic and reasoning. Rely on agents to use reasoning and logic over hardcoding procedural steps into tools.
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Implementation Review Custom Instructions
+
+## Keep PAW Spec and Status Agent Synchronized
+
+When reviewing changes to `paw-specification.md`, verify the Status Agent (`agents/PAW-X Status Update.agent.md`) is updated to match:
+- New/changed workflow stages, modes, or review strategies
+- Updated branching patterns or agent responsibilities
+- New artifacts or deliverables
+
+When reviewing changes to the Status Agent, verify it aligns with `paw-specification.md`:
+- Stage descriptions match spec definitions
+- Workflow mode/review strategy behaviors reflect spec
+- Duration estimates and navigation commands are accurate
+
+**Source of truth**: `paw-specification.md` → Status Agent reflects the spec for user guidance.
+
+If synchronization is missing, document the gap and request Implementation Agent address it in a follow-up commit.

.paw/work/workflow-handoffs/CodeResearch.md

Lines changed: 476 additions & 0 deletions
Large diffs are not rendered by default.

.paw/work/workflow-handoffs/Docs.md

Lines changed: 438 additions & 0 deletions
Large diffs are not rendered by default.

.paw/work/workflow-handoffs/ImplementationPlan.md

Lines changed: 1368 additions & 0 deletions
Large diffs are not rendered by default.

.paw/work/workflow-handoffs/Spec.md

Lines changed: 264 additions & 0 deletions
Large diffs are not rendered by default.

.paw/work/workflow-handoffs/SpecResearch.md

Lines changed: 965 additions & 0 deletions
Large diffs are not rendered by default.
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
+# WorkflowContext
+
+Work Title: Workflow Handoffs
+Feature Slug: workflow-handoffs
+Target Branch: feature/workflow-handoffs
+Workflow Mode: full
+Review Strategy: prs
+Issue URL: https://github.com/lossyrob/phased-agent-workflow/issues/69
+Planning PR: https://github.com/lossyrob/phased-agent-workflow/pull/114
+Remote: origin
+Artifact Paths: auto-derived
+Additional Inputs: none
+
+Note: As part of this work we will also be implementing https://github.com/lossyrob/phased-agent-workflow/issues/60 - add this to the spec and implementation plan.
+
+Note: PAW-01A Specification.agent.md exceeds the 6500 token lint error limit (currently ~6831 tokens). This is a pre-existing issue that will be addressed separately. The agent linter failing for this file is expected and OK during this work item.
Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+These are voice notes that were recorded when reviewing phase four of the implementation.
+
+I have some comments on the PR about the handoff scenarios, so I just want to talk those through so I can think through the complete handoff set. For the specification agent, as I said in my comment, the next stage is conditional. If the spec research is done and the spec is completed, then the next stage would be the code researcher. However, users can run the specification agent, which can then generate a spec research prompt that the spec research agent needs to go and run. So the next stage depends on whether this is going to go to spec research, which then goes back to the spec agent, or whether the spec generation is completed and we're now moving on to code research.
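The conditional handoff described for the specification agent can be sketched as a small state check. This is only an illustration of the branching, not code from the commit: the `SpecState` enum, the function name, and the returned stage labels are all hypothetical.

```python
from enum import Enum


class SpecState(Enum):
    """Hypothetical states the specification agent can finish in."""
    NEEDS_RESEARCH = "needs_research"  # a spec research prompt was generated
    COMPLETE = "complete"              # the spec itself is finished


def next_stage_after_spec(state: SpecState) -> str:
    """Pick the stage that follows the specification agent.

    If research is still needed, go to the spec researcher (which later
    hands back to the spec agent); otherwise move on to code research.
    """
    if state is SpecState.NEEDS_RESEARCH:
        return "spec researcher"
    return "code researcher"


print(next_stage_after_spec(SpecState.NEEDS_RESEARCH))
print(next_stage_after_spec(SpecState.COMPLETE))
```

The point of the sketch is that the next stage is a function of the spec's state rather than a fixed successor.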
+
+For the implementation planner agent, the planning handoff says all modes pause after planning. We don't have to say that; it's going to pause when it talks about the next step. And the next stage is not definitely going to be implementation phase one, because in PR mode the user could make PR comments, in which case they ask the implementation planner to address those comments. So it's the same agent: it hands off to itself, but in the review context, where it's looking at the PR and addressing the review comments. If the review strategy is local rather than prs, there are no PR review comments; the user would just chat with the implementation planner agent to make updates to the plan. So the handoff can mention something like "ask me to update the plan", or we can move on to implementation. That's where the options come in, like "implement". It shouldn't require the user to say "implement phase one". It should just say "implement", and then the implementer agent will figure out what phase to work on. Similarly, "generate prompt for implementer phase one" is too verbose. It should use terse language like "generate implementation prompt" or "generate implementation prompt file". And if the user ends up saying "generate implementation prompt file for phase 4", it will do that; it should be smart enough to know how to name those. So if there aren't instructions on how prompt files are named, we might need to include them.
+
+For the implementer agent, and maybe elsewhere, where it says semi-auto/auto immediate handoff, we probably want to specify that it's an immediate handoff to the reviewer. There are two options, review and status; I think the agent could probably figure out which one is more appropriate, but we might as well make that explicit. It also says after addressing review comments, hand off to PAW-03B, which is not a full agent name. So unless the agent name is mapped somewhere else, it should say the full agent name. When the user says an agent name, it should be a short name, like "reviewer", not anything that has PAW or the numbers in it. It should be a friendly name, like "spec" or "reviewer". And if we're in the spec phase, it would know that it means spec reviewer. Oh, sorry, no, there's no spec reviewer. It should be spec researcher, and if it's in the spec phase then we know that's the spec researcher. Planner, right? If it's the implementation planner, I should just be able to say "planner". Likewise "implementer" and "reviewer": if we're in an implementation phase, "reviewer" should mean the implementation reviewer. "Docs", "PR": these short phrases, interpreted from the context, make the agent understand what we're talking about. In the implementation reviewer agent, it says all modes pause and wait for human PR review. If it's not the PR strategy, then that's not true. So if the review strategy is prs, yes, it'll have to stop there. If it's automatic, then the handoff just goes to the next phase. That should be clear.
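One way to picture the friendly short names is a lookup that consults the current phase first and then falls back to phase-independent names. This is a minimal sketch under stated assumptions: the tables, the phase keys, and the full agent names on the right-hand side are placeholders for illustration, not the repository's actual agent names.

```python
# Hypothetical mapping tables; the full names are illustrative placeholders.
PHASE_NAMES = {
    "spec": {"researcher": "Spec Researcher"},
    "implementation": {"reviewer": "Implementation Reviewer"},
}
GLOBAL_NAMES = {
    "planner": "Implementation Planner",
    "implementer": "Implementer",
    "docs": "Documenter",
    "pr": "Final PR",
}


def resolve_agent(short_name: str, phase: str) -> str:
    """Resolve a friendly short name to a full agent name for a phase."""
    key = short_name.strip().lower()
    phase_specific = PHASE_NAMES.get(phase, {})
    if key in phase_specific:
        return phase_specific[key]
    if key in GLOBAL_NAMES:
        return GLOBAL_NAMES[key]
    raise ValueError(f"No agent named {short_name!r} in phase {phase!r}")


print(resolve_agent("reviewer", "implementation"))
print(resolve_agent("planner", "spec"))
```

With these tables, "reviewer" resolves only in the implementation phase and raises an error in the spec phase, which mirrors the observation that there is no spec reviewer.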
+
+For the documenter agent: the documentation agent should understand the review process as well, so it should be able to hand off to itself with a review prompt that says, you know, review the comments on the PR, if we're in the PR review strategy. I should be able to hand off to the documenter with the review mode enabled so that it updates based on the PR comments.
+
+In the PR agent it says address review comments on PR number yadda yadda. I don't want to specify the PR. I want it to know how to find that, and it already does, so it should just be "review" again, just a short one-word "review". So if I'm in the PR agent and I say "review", or I say "review PR", it should be clear that what that means is to go to the implementer agent with the expectation that there's a PR review, if I'm in the review strategy prs.
+
+For the understanding agent, the next steps are conditional on whether the baseline researcher has done its work. So it should not just go immediately to the impact analysis. If the baseline research is not done, then it will have created a prompt, and that has to get handed off to the baseline researcher with specific instructions to run the prompt that is at the relative path of that prompt file. That's important. And it's the same thing with the spec researcher. I'm sorry, the spec agent. The spec agent can output a prompt file that needs to be run with the spec researcher, so it needs to hand off with that specific path; it's important to make sure that the path to the generated prompt file is correct.
+
+For the review phases, there's a pretty straightforward next step. And actually this might be true in a number of different cases. So we need to reason about the handoffs, and if the user says "continue", go with the most reasonable next step, as if it were in auto or semi-auto mode. If the user says "continue", just do the next stage, the one that is the default stage.
+
+
+
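The "continue" behaviour from the notes above can be sketched as an ordered list of handoff options in which the first entry is the default. Everything here is an assumption made for illustration: the function names, the command strings, and the idea that the planner's options vary only by review strategy.

```python
def planner_options(review_strategy: str) -> list[str]:
    """Hypothetical handoff commands for the planner, default first."""
    if review_strategy == "prs":
        # In the prs strategy the planner can also address PR comments.
        return ["implement", "review"]
    # In the local strategy the user just chats to update the plan.
    return ["implement", "update plan"]


def resolve_command(user_input: str, options: list[str]) -> str:
    """Map user input to a handoff command; "continue" means the default."""
    command = user_input.strip().lower()
    if command == "continue":
        return options[0]
    if command in options:
        return command
    raise ValueError(f"Unknown command {user_input!r}; expected one of {options}")


print(resolve_command("continue", planner_options("prs")))
print(resolve_command("update plan", planner_options("local")))
```

Keeping the default at index 0 means "continue" and auto or semi-auto modes can share the same notion of the most reasonable next step.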
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+These are voice notes about a refactor that I want to do at the final PR stage.
+
+After doing manual testing, I'm finding that the handoff scenario is hard for the agents to get right when in auto mode. The agents are constantly pausing and acting as though they're in manual handoff mode, telling me what the next steps are and not proceeding. I ask the agents to inspect their instructions and try to determine why they didn't comply, and the answer is usually that it's buried in the documentation and they just didn't get it right. I've updated the handoff instructions and agents a few times and they don't seem to be getting it. And this is with Claude Opus 4.5.
+
+I'm thinking of taking an alternate approach to the handoff. Instead of having the instructions about both a manual handoff and an auto handoff in the agent prompt, the agent prompt would say that when ready to hand off, call a tool, and that tool will give the handoff instructions. The agent would call the tool with the handoff mode and its agent name, and then we would just have pre-canned handoff instructions that can be loaded and sent to the agent.
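A minimal sketch of such a tool, assuming the pre-canned instructions are keyed by handoff mode: the function name, the template wording, and the fallback to manual for an unknown mode are all hypothetical and only illustrate the lookup.

```python
# Placeholder wording; the real instructions would be authored separately.
HANDOFF_TEMPLATES = {
    "manual": "Present the next-step options to the user and wait for a command.",
    "semi-auto": "Hand off at routine transitions and pause at decision points.",
    "auto": "Hand off to the next agent immediately without asking the user.",
}


def get_handoff_instructions(agent_name: str, handoff_mode: str) -> str:
    """Return the single set of instructions that applies to this mode."""
    template = HANDOFF_TEMPLATES.get(handoff_mode, HANDOFF_TEMPLATES["manual"])
    return f"Handoff instructions for {agent_name} ({handoff_mode}): {template}"


print(get_handoff_instructions("implementer", "auto"))
```

Because the agent receives only the text for its own mode, it never has to choose between the manual and auto variants.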
+
+This might be tricky for the custom workflow mode, where we're actually not sure what the stages are. So having the instructions about, you know, is this a handoff scenario, and here's the pattern of agents talking to each other: I think that makes sense to give as one, no matter what the agent is. Here's the handoff instructions. But if it's in auto mode, have a different set of instructions, and what that would likely do is just make it so that the agent is not confused by two different sets of instructions based off of the mode.
+
+I'm wondering if we can just stuff this into the PAW context, so that we don't have another tool call, and then just have the handoff instructions refer back to the PAW context return value for that tool call. That should be enough. It'll be in the message history, and the agent should be able to focus its attention on that tool call result. So I actually think that might be the way to go.
+
+The one tricky thing about that is that the workflow context is returned in that tool call, so the agent can't pass in what the handoff mode is; it would have to be parsed out of the workflow context. I've been avoiding parsing things out of the workflow context because it can be a free-form document that the agent will interpret and reason about. However, in this case I think we should just have safe parsing and, if the handoff mode can't be determined by the parsing, default to manual. That should be a pretty rare edge case, where the handoff mode was edited and it can't be parsed, and maybe the user wants auto. We should just log to the output channel that it's not correct, and that if you want to fix it, set the handoff mode to auto, semi-auto, or manual. Give a little bit of instruction on how to modify the workflow context to fix it.
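The safe-parsing idea can be sketched as follows. The `Handoff Mode:` field label, the accepted spellings, and the use of Python's `logging` module as a stand-in for the output channel are assumptions for illustration; the notes only say that an unparseable mode should default to manual and be logged with a hint on how to fix it.

```python
import logging
import re

logger = logging.getLogger("paw")  # stand-in for the extension's output channel

VALID_MODES = ("manual", "semi-auto", "auto")
# Assumed field label; matches a line such as "Handoff Mode: Semi Auto".
_FIELD = re.compile(r"^[ \t]*Handoff Mode[ \t]*:[ \t]*(.+?)[ \t]*$",
                    re.IGNORECASE | re.MULTILINE)


def parse_handoff_mode(workflow_context: str) -> str:
    """Extract the handoff mode, defaulting to "manual" when unsure."""
    match = _FIELD.search(workflow_context)
    if match:
        # Normalize "Semi Auto" / "semi_auto" to "semi-auto".
        normalized = re.sub(r"[\s_]+", "-", match.group(1).lower())
        if normalized in VALID_MODES:
            return normalized
    logger.warning(
        "Could not determine the handoff mode; defaulting to manual. "
        "To fix this, set 'Handoff Mode' in the workflow context to one of: %s",
        ", ".join(VALID_MODES),
    )
    return "manual"


print(parse_handoff_mode("Work Title: Workflow Handoffs\nHandoff Mode: Semi Auto\n"))
```

Defaulting to manual is the conservative choice: at worst the agent pauses when the user wanted it to proceed, and the logged message explains how to correct the field.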
+
+This will require changes in the handoff component, and the PAW context will need another section in it. But I think that'll be more clear for the agents, and I'll test it and see if that's the case.
