-
Notifications
You must be signed in to change notification settings - Fork 288
fix(event_loop): ensure tool_use content blocks are valid after max_tokens to prevent unrecoverable state #607
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(event_loop): ensure tool_use content blocks are valid after max_tokens to prevent unrecoverable state #607
Conversation
Co-authored-by: Nick Clegg <[email protected]>
Co-authored-by: Nick Clegg <[email protected]>
This PR moved to a new approach. Rather than using Hooks we leverage the ConversationManager. There is existing art for these exceptional EventLoop cases in the ContextWindowOverflowException handling. This avoids bloat in the Agent constructor. It also now automatically applies the handling as it is applied in the default ConversationManager, SlidingWindowConversationManager. More importantly however, Strands takes the position that hooks should be reserved for augmenting or manipulating the Agent, at the discretion of the implementor. It should not be used for cases like this where there is a known issue that the SDK handles itself. Internally, hooks may be leveraged. But the way they are applied for cases like these should not be by requiring the user to pass A Missing from this PR is telemetry. #617 has been created to equip the builtin conversation managers. |
Description
Overview
This PR introduces a new event handling mechanism for managing event loop failures, specifically focusing on the MaxTokensReachedException case.
When an LLM is generating a tool_use content block but hits max_tokens. It leaves the agent in an unrecoverable state. To address this we made a change to raise an exception in #576, This prevented the problematic message from being added to the messages array.
This required implementors to re-implement the same filtering logic. In this PR we apply the filtration by default. This will prevent issues like the one called out in #541 by default. A follow up PR will allow this filtration/transformation to be more configurable.
After this PR, if a MaxTokensReachedException is hit. An agent can be restarted. But, it still terminates. This is because the stopReason is still
max_tokens
.Some things to note about ordering
Related Issues
#541
#576
#561 - until this is completed, we will need to overwrite ALL tool_use content blocks. After it is we can overwrite only known broken tool uses.
Documentation PR
Follow up after approval
Type of Change
Bug fix
Testing
How have you tested the change? Verify that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
hatch run prepare
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.