feat(interrupt): implement interrupt system for human-in-the-loop workflows#784
…kflows
Add comprehensive interrupt support for agent execution pausing:
- Implement Interrupt and InterruptState classes to manage interrupt lifecycle
- Add interrupt() method to BeforeToolCallEvent and BeforeToolsEvent hooks
- Add interrupt() method to ToolContext for use in tool callbacks
- Handle InterruptError in agent loop to return stopReason: 'interrupt'
- Return interrupts array in AgentResult for inspection
The interrupt system allows agents to pause execution at specific points and resume with user responses, enabling human-in-the-loop workflows for approvals, confirmations, and other interactive scenarios.
Resolves #479
- Export interrupt types from src/index.ts (Interrupt, InterruptError, InterruptState, InterruptParams, InterruptResponse, InterruptResponseContent)
- Extract duplicate InterruptError handling into _createInterruptResult() helper
- Add interrupt state check after BeforeToolsEvent/BeforeToolCallEvent yields to properly propagate hook interrupts
- Implement name-based interrupt matching in getOrCreateInterrupt() for resume across model calls with different tool use IDs
- Update jsdoc for _interruptCounter to document serialization exclusion
- Add e2e tests for interrupt → response → continue resume flow
- Add tests for name-matching behavior in InterruptState
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
…l results
When resuming from an interrupt, the agent now:
- Skips re-calling the model and uses the stored assistant message
- Preserves completed tool results to avoid re-executing successful tools
- Only executes the tool that was interrupted (and any remaining tools)
Example scenario: If tools A, B, C are requested and A & B succeed but C interrupts, on resume only C executes; A and B are skipped.
Implementation:
- Add PendingToolExecution interface to store assistant message and completed results
- Store pending state when interrupt occurs during tool execution
- On resume, check for pending state and skip model invocation
- Pass completed tool results to executeTools to skip already-completed tools
- Clear pending state when user sends a new message (abandons interrupted flow)
This addresses the reviewer feedback that resume should jump directly to tool execution without re-calling the model.
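The A/B/C scenario can be modeled with a small self-contained sketch. This is not the code from the PR: `executeToolsSketch`, `ToolInterrupted`, and the `ToolUse`/`ToolResult` shapes are hypothetical stand-ins, and the real `executeTools` may be structured differently.

```typescript
// Hypothetical sketch of resume semantics; names and shapes are assumptions, not the SDK's API.
type ToolUse = { toolUseId: string; name: string }
type ToolResult = { toolUseId: string; content: string }

class ToolInterrupted extends Error {}

// Executes tool uses in order, skipping any whose result is already in `completed`.
// Returns the names of the tools that actually ran during this call.
function executeToolsSketch(
  toolUses: ToolUse[],
  completed: Map<string, ToolResult>,
  run: (toolUse: ToolUse) => string
): string[] {
  const executed: string[] = []
  for (const toolUse of toolUses) {
    if (completed.has(toolUse.toolUseId)) continue // finished before the interrupt
    const content = run(toolUse) // may throw ToolInterrupted
    completed.set(toolUse.toolUseId, { toolUseId: toolUse.toolUseId, content })
    executed.push(toolUse.name)
  }
  return executed
}

const requested: ToolUse[] = [
  { toolUseId: 't1', name: 'A' },
  { toolUseId: 't2', name: 'B' },
  { toolUseId: 't3', name: 'C' },
]
const completedResults = new Map<string, ToolResult>()
let approved = false
const runTool = (toolUse: ToolUse): string => {
  if (toolUse.name === 'C' && !approved) throw new ToolInterrupted('C needs approval')
  return `${toolUse.name} ok`
}

// First pass: A and B complete, then C interrupts.
let interrupted = false
try {
  executeToolsSketch(requested, completedResults, runTool)
} catch (error) {
  if (!(error instanceof ToolInterrupted)) throw error
  interrupted = true
}

// Resume: the caller approved, so only C runs; A and B are skipped.
approved = true
const executedOnResume = executeToolsSketch(requested, completedResults, runTool)
```

Keying completed results by tool use ID is what lets the resume pass reuse the stored assistant message: the IDs stay stable because the model is not called again.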
…ptState
Move the logic for reconstructing assistant message and completed tool
results from agent.ts into InterruptState.getPendingExecution() method.
This provides better encapsulation and makes the agent code cleaner:
- Before: inline reconstruction with Message.fromMessageData() and loop
- After: this._interruptState.getPendingExecution()
The method returns { assistantMessage, completedToolResults } or undefined
if no pending execution exists.
Assessment: Request Changes
This is a well-designed implementation of the human-in-the-loop interrupt system with comprehensive tests. The API design is intuitive and the code follows project patterns well.
Review Categories
The interrupt mechanism itself is well-thought-out, particularly the handling of partial tool execution and resume semantics.
    // InterruptResponseContent[] passes through to the agent — cannot be merged with deps
    if (Array.isArray(input) && input.length > 0 && isInterruptResponseContent(input[0])) {
      return input
    }
I would recommend leaving this out for now and handling graph interrupts separately. This is just one piece of a larger change.
Originally it was added to work around the type checking, but I'm moving to excluding it via the type system instead.
     * })
     * ```
     */
    interrupt<T = unknown>(params: InterruptParams): T {
In Python, we set up an _Interruptible protocol (here). This allowed us to define shared logic for the hook events to derive. Most notably, it defines the interrupt method. Could we do something similar in TS to help avoid duplication?
Adding an interface plus a shared method. I don't want to add a new base class for just this method, and mixins (or protocols) aren't as much of a thing in TS as in Python.
    agent: LocalAgent
    toolUse: { name: string; toolUseId: string; input: JSONValue }
    tool: Tool | undefined
    interruptState?: InterruptState
I think it would make sense to manage this under LocalAgent so that we can access it through agent.interruptState. I understand, though, that we would have to make it a public attribute, which maybe we want to hold off on. Was that your thinking and the reason why you pass interruptState separately?
> I understand though we would have to make it a public attribute which maybe we want to hold off on.
I don't want to make this public, as to me it's an internal implementation detail. We can use duck typing/casting to access it if we want, though.
> Was that your thinking and the reason why you pass interruptState separately?
It was more of a "We already have access to it here, why not pass it in directly without a cast". That said, I know we use the cast in other places, so I think I'll migrate to that instead, which leaves us open to a better two-way door.
    } catch (error) {
      if (error instanceof InterruptError) {
        const interruptResult = this._createInterruptResult()
        yield await this._invokeCallbacks(new AgentResultEvent({ agent: this, result: interruptResult }))
        return interruptResult
      }
      throw error
Do we need this catch? Don't we handle interrupt errors gracefully in streamGenerator?
Removing this one; I think it was left over from an earlier revision when error handling wasn't figured out.
According to Kiro, this would be effective for some cases like BeforeInvocationEvent, but those are not a concern for today.
     * User's response to the interrupt.
     * Can be any value that the hook or tool expects.
     */
    response: unknown
Does this need to be serializable? Should we use JSONValue?
yeah; changed it to JSONValue; same for reason
    if (collectedInterrupts.length > 0) {
      const seen = new Set<string>()
      for (const interrupt of collectedInterrupts) {
        if (seen.has(interrupt.name)) {
nit: can we gather all of the duplicates and then raise the error?
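One way to do that, sketched under assumptions: `findDuplicateNames` and `assertUniqueNames` are hypothetical helpers (not functions from the PR), and the error wording is illustrative only.

```typescript
// Hypothetical helpers: report every duplicated interrupt name in a single error.
type NamedInterrupt = { name: string }

// Returns each name that appears more than once, in order of first repetition.
function findDuplicateNames(interrupts: NamedInterrupt[]): string[] {
  const seen = new Set<string>()
  const duplicates = new Set<string>()
  for (const interrupt of interrupts) {
    if (seen.has(interrupt.name)) duplicates.add(interrupt.name)
    seen.add(interrupt.name)
  }
  return Array.from(duplicates)
}

// Throws once, listing all duplicates, instead of failing on the first one found.
function assertUniqueNames(interrupts: NamedInterrupt[]): void {
  const duplicates = findDuplicateNames(interrupts)
  if (duplicates.length > 0) {
    throw new Error(`Duplicate interrupt names: ${duplicates.join(', ')}`)
  }
}
```

Reporting all duplicates at once saves the hook author from fixing one name, rerunning, and hitting the next.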
    expect(beforeTools).toEqual(
      new BeforeToolsEvent({
      expect.objectContaining({
Just curious, why doesn't the class work here?
It was the private field that was added to BeforeToolsEvent; with that change gone, we can revert.
      throw new Error('Interrupt state not available')
    }

    const interruptId = `beforeTools:${params.name}`
Do we need a more unique id here? I'm thinking of an example like this:
Imagine the user set up an interrupt in the BeforeTools event with an id of ConsentToTools.
If the agent executes and then decides to invoke a tool, this interrupt will trigger with an id of beforeTools:ConsentToTools.
If the user responds to this interrupt, the agent loop will get to this BeforeTools event, pass the interrupt, call the tools, and continue the agent loop. Then the model decides to call a tool again. The previous interrupt will still be in interrupt state, so it will be bypassed without another interrupt.
I'm thinking of maybe including all of the upcoming tool use IDs in this id, so that it's unique for each turn. What do you think?
The interrupt state should be cleared before that next loop and so this id would not have to be more unique. You can see here in Python we clear the interrupt state immediately after tool execution so that another internal loop will lead to another interrupt (unless of course the user is caching themselves with agent.state).
I'm scanning the PR right now to see if similar logic is present.
It doesn't appear that interrupt state is cleared after tool calls. We clear at the end of invocation. So we either need to clear after tool calls or add a more unique identifier as suggested. Using the batch of tool use ids could work because it is deterministic.
Good catch.
> So we either need to clear after tool calls or add a more unique identifier as suggested.
Clearing after tool executions. It doesn't make sense to me to add ids, given that they should always be cleared and because other interruptible events in the future (like model invocation, if we supported it) don't have a similar id, so this pattern makes sense to me.
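The effect of clearing can be shown with a toy model. Everything here is hypothetical (`TurnInterruptState`, `cyclesPassedWithoutAsking`); it only mimics the scenario from this thread, where one stored response for beforeTools:ConsentToTools is consulted on every tool-calling cycle.

```typescript
// Hypothetical model of the id-reuse problem; not the SDK's InterruptState.
class TurnInterruptState {
  private responses = new Map<string, string>()

  respond(id: string, response: string): void {
    this.responses.set(id, response)
  }

  // Returns the stored response, or undefined when the caller must be asked.
  lookup(id: string): string | undefined {
    return this.responses.get(id)
  }

  clear(): void {
    this.responses.clear()
  }
}

// Simulates up to `turns` consecutive tool-calling cycles within one invocation and counts
// how many cycles passed the BeforeTools check using the single response given up front.
function cyclesPassedWithoutAsking(turns: number, clearAfterTools: boolean): number {
  const state = new TurnInterruptState()
  const id = 'beforeTools:ConsentToTools' // same id on every cycle
  state.respond(id, 'y') // the user answered once
  let passed = 0
  for (let turn = 0; turn < turns; turn++) {
    if (state.lookup(id) === undefined) break // the agent would interrupt again here
    passed++ // tools execute for this cycle
    if (clearAfterTools) state.clear()
  }
  return passed
}
```

Without clearing, `cyclesPassedWithoutAsking(3, false)` lets all three cycles through on one answer; with clearing, only the first cycle passes and the second one asks again.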
    return {
      id: this.id,
      name: this.name,
      ...(this.reason !== undefined && { reason: this.reason }),
Similar to the other comment, but should these be unknown if we want to be able to JSON serialize them?
General feedback from my review agent:
Critical Issues
Issue 1:
@Unshure any items you feel should be addressed/prioritized? Addressed the snapshot one; that was a good catch.
# Conflicts:
#	src/agent/__tests__/snapshot.test.ts
#	src/agent/snapshot.ts
     * Tool results that were completed before the interrupt.
     * Maps toolUseId to serialized ToolResultBlock data.
     */
    completedToolResults: Record<string, ContentBlockData>
This should be ToolResultBlockData.
     *
     * Interrupt state is cleared after resuming.
     */
    export class InterruptState {
Nit: Should this implement InterruptStateData?
    const completedToolResults = new Map<string, ToolResultBlock>()
    for (const [toolUseId, resultData] of Object.entries(this.pendingToolExecution.completedToolResults)) {
      const block = contentBlockFromData(resultData)
We have a toolResultContentFromData as well.
     * })
     * ```
     */
    interrupt<T = unknown>(params: InterruptParams): T
Should the default be JSONValue instead of unknown?
Note that this could break users mocking ToolContext for unit tests on custom function tools. I'm not so concerned about that though, since we are in RC right now.
    : input.map((b) => ('type' in b ? (b as ContentBlock) : contentBlockFromData(b)))
    : (input as Exclude<typeof input, string>).map((b) =>
        'type' in b ? (b as ContentBlock) : contentBlockFromData(b as ContentBlockData)
      )
Is this necessary with the changes to the MultiAgentInput definition?
     * Accesses the agent's interrupt state to register or resume an interrupt.
     */
    function _interruptFromAgent<T>(agent: LocalAgent, interruptId: string, params: InterruptParams): T {
      const interruptState = (agent as unknown as { _interruptState?: InterruptState })._interruptState
This awkward casting seems to be a sign that we need to find a way to expose internal state publicly? We can figure that out in a follow up though.
    // If user sends a regular message (not interrupt responses), clear any pending state
    // This allows the user to "abandon" an interrupted workflow and start fresh
    this._interruptState.clearPendingToolExecution()
    this._interruptState.deactivate()
This works here in TS because we don't add tool uses until we have the tool results, correct? In Python, this would not work because we would have unanswered tool uses in the messages array.
    let structuredOutputChoice: ToolChoice | undefined

    // Emit event before the try block
    yield new BeforeInvocationEvent({ agent: this })
What events fire on interrupt and resume is something we should make more clear in the docs. In Python, we skip the AfterToolCallEvent if interrupting a tool. The motivation here is that we don't have a tool result to feed it. We do however still emit AfterInvocationEvent. We also emit the before events on resume including BeforeToolCallEvent. This means we emit a BeforeToolCallEvent without a pairing AfterToolCallEvent. Maybe we want to change the behavior for TS?
Maybe it would also be worth adding some sort of field to the Before events that indicates whether or not the agent is resuming from an interrupt. There might be certain things a user doesn't want done when just resuming from an interrupt. This could be considered for follow up though.
We should consider emitting an InterruptEvent. In Python, we have a ToolInterruptEvent, but maybe we can generalize. This could be considered for follow up though.
     * @param reason - Optional reason for the interrupt
     * @returns The interrupt (may have a response if resuming)
     */
    getOrCreateInterrupt(id: string, name: string, reason?: JSONValue): Interrupt {
In Python, we allow a user to pass in a response from within their hook or tool definitions. This is to skip the interrupt if they already have a response ready. For example, they might raise an interrupt in BeforeToolCallEvent and save the response in state for future invocations ([details](Can users provide a preemptive response)). Not absolutely necessary but customers can get slightly cleaner code:
    // not supported
    agent.addHook(BeforeToolCallEvent, (event) => {
      if (event.toolUse.name !== 'delete_files') return
      const cached = event.agent.appState.get('approval')
      if (cached) return
      const response = event.interrupt({ name: 'approval', reason: 'Confirm deletion?' })
      event.agent.appState.set('approval', response)
      if (response !== 'y') {
        event.cancel = 'User denied permission'
      }
    })

    // supported
    agent.addHook(BeforeToolCallEvent, (event) => {
      if (event.toolUse.name !== 'delete_files') return
      const cached = event.agent.appState.get('approval')
      const response = event.interrupt({ name: 'approval', reason: 'Confirm deletion?', response: cached })
      event.agent.appState.set('approval', response)
      if (response !== 'y') {
        event.cancel = 'User denied permission'
      }
    })

Not sure if it is worth it anymore. Can follow up later. Definitely doesn't need to be considered for this PR.
Since interrupts are an internal mechanism, the unit tests should offer sufficient coverage. Still, I think it would be good to at least integ test the happy paths. We do this in Python (here). Could be considered for follow up.
    lastMessage,
    traces: this._tracer.localTraces,
    metrics: this._meter.metrics,
    interrupts: this._interruptState.getInterruptsList(),
We should only be returning the interrupts not yet responded to. This is what we do in Python (here). We can get into this situation in a few ways:
- Customer raises interrupt in BeforeToolsEvent and then again in BeforeToolCallEvent.
- Customer raises multiple interrupts on a single event (can only be processed one at a time).
- Customer doesn't respond to all the interrupts the first time around.
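A minimal sketch of that filtering, assuming an interrupt carries an optional `response` field that stays undefined until the caller answers. `InterruptSketch` and `pendingInterrupts` are hypothetical names, and `response` is typed as a string only for brevity.

```typescript
// Hypothetical shape: `response` is undefined until the caller has answered the interrupt.
interface InterruptSketch {
  id: string
  name: string
  response?: string
}

// Returns only the interrupts that still need an answer from the caller.
function pendingInterrupts(all: InterruptSketch[]): InterruptSketch[] {
  return all.filter((interrupt) => interrupt.response === undefined)
}

const registered: InterruptSketch[] = [
  { id: 'beforeTools:consent', name: 'consent', response: 'y' }, // already answered
  { id: 'beforeToolCall:approval', name: 'approval' }, // raised later, still open
]
const stillOpen = pendingInterrupts(registered)
```

Comparing against undefined rather than testing truthiness keeps answers such as an empty string from being reported as unanswered.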
    yield new BeforeInvocationEvent({ agent: this })

    // Normalize input to get the user messages for telemetry
    const inputMessages = this._normalizeInput(args)
It seems like we are allowing users to mix in other content block types with their interrupt responses (in Python we raise an exception). It looks like if interrupt response isn't the first content item, we could end up populating inputMessages which get appended to the messages array down below (through a second call to normalizeInput).
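A stricter check could look roughly like the following. The block shapes are assumptions made for illustration (the real InterruptResponseContent type may differ), and `extractInterruptResponses` is a hypothetical helper, not a function in the PR.

```typescript
// Hypothetical shapes; the real InterruptResponseContent type may differ.
type InterruptResponseBlock = { interruptResponse: { interruptId: string; response: string } }
type TextBlock = { text: string }
type InputBlock = InterruptResponseBlock | TextBlock

function isInterruptResponseBlock(block: InputBlock): block is InterruptResponseBlock {
  return 'interruptResponse' in block
}

// Returns the interrupt responses when the input is a resume, [] for ordinary input,
// and throws when the two kinds of block are mixed, regardless of their order.
function extractInterruptResponses(input: InputBlock[]): InterruptResponseBlock[] {
  const responses = input.filter(isInterruptResponseBlock)
  if (responses.length > 0 && responses.length !== input.length) {
    throw new TypeError('Interrupt responses cannot be mixed with other content blocks')
  }
  return responses
}
```

Checking every element, rather than only the first, closes the case described above where an interrupt response appears after another content block.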
Description
Adds an interrupt system for human-in-the-loop workflows. Agents can pause execution within tool callbacks, BeforeToolCallEvent hooks, or BeforeToolsEvent hooks to collect user input before proceeding.
The interrupt() method halts the agent loop on first call, returning stopReason: 'interrupt' with an interrupts array. When the caller resumes with InterruptResponseContent objects, the same interrupt() call returns the user's response instead of halting.
Multiple hook callbacks on the same event can each raise their own interrupt. The registry collects all interrupts across callbacks before halting, and duplicate interrupt names are rejected. This matches the Python SDK's behavior.
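The halt-then-resume contract can be illustrated with a self-contained toy model. None of this is the SDK's implementation: `SketchInterruptState`, `SketchInterruptError`, `SketchResult`, and `runOnce` are hypothetical names, the 'endTurn' stop reason is a placeholder, and the sketch handles a single interrupt rather than collecting several across callbacks.

```typescript
// Hypothetical, self-contained model of the halt/resume contract; not the SDK's code.
type JSONValue = string | number | boolean | null | JSONValue[] | { [key: string]: JSONValue }

class SketchInterruptError extends Error {
  readonly interruptId: string

  constructor(interruptId: string) {
    super(`interrupted: ${interruptId}`)
    this.interruptId = interruptId
  }
}

class SketchInterruptState {
  // Responses supplied by the caller on resume, keyed by interrupt id.
  private readonly responses = new Map<string, JSONValue>()

  respond(interruptId: string, response: JSONValue): void {
    this.responses.set(interruptId, response)
  }

  // The first call for an id throws (halting the loop); after respond(), it returns the response.
  interrupt(interruptId: string): JSONValue {
    if (this.responses.has(interruptId)) {
      return this.responses.get(interruptId) as JSONValue
    }
    throw new SketchInterruptError(interruptId)
  }
}

type SketchResult =
  | { stopReason: 'interrupt'; interrupts: string[] }
  | { stopReason: 'endTurn'; output: string }

// Runs one "invocation": a single step that asks for approval before finishing.
function runOnce(state: SketchInterruptState): SketchResult {
  try {
    const answer = state.interrupt('approval')
    return { stopReason: 'endTurn', output: answer === 'y' ? 'deleted' : 'cancelled' }
  } catch (error) {
    if (error instanceof SketchInterruptError) {
      return { stopReason: 'interrupt', interrupts: [error.interruptId] }
    }
    throw error
  }
}

const sketchState = new SketchInterruptState()
const firstRun = runOnce(sketchState) // halts: no response has been provided yet
sketchState.respond('approval', 'y')
const secondRun = runOnce(sketchState) // the same interrupt() call now returns 'y'
```

The point of the sketch is that the hook or tool code is identical on both runs; only the stored response decides whether `interrupt()` throws or returns.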
Both assistant and tool result messages are appended only after tool execution completes, preventing dangling toolUse blocks without matching results. When an interrupt fires mid-batch, completed tool results are preserved so the agent skips the model call on resume and only executes remaining tools. Hook-level interrupts (from BeforeToolCallEvent/BeforeToolsEvent) also store pending execution state, so resume skips the model call just like tool-level interrupts.
Public API Changes
End-to-end usage
New types and exports
- Interrupt class: interrupt data returned in AgentResult.interrupts
- InterruptParams, InterruptResponse, InterruptResponseContent types
- InvokeArgs now includes InterruptResponseContent[] as a valid input type
- ToolContext now includes interrupt<T>(params: InterruptParams): T
- InterruptError and InterruptState are internal and not exported
Related Issues
Documentation PR
Type of Change
New feature
Testing
npm run check
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.