feat: Make max iteration limit resumable without reminders #2453
utkarsh-in wants to merge 7 commits into OpenHands:main
- Add MAX_ITERATIONS_REACHED status to ConversationExecutionStatus enum
- Mark MAX_ITERATIONS_REACHED as terminal state
- Add ConversationIterationLimitEvent for better error handling
- Modify send_message() to reset MAX_ITERATIONS_REACHED to IDLE on new user input
- Modify run() to allow restarting from MAX_ITERATIONS_REACHED state
- Remove final step reminder message injection when max iterations reached
- Add budget information to agent system prompt
- Add budget warning messages at 80% and 95% of max iterations
- Update tests to cover new status transitions

This change allows conversations to resume after hitting max iterations when a new user message is sent, instead of permanently stopping. Removes the intrusive reminder message that was previously injected.
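To make the intended status transitions concrete, here is a minimal sketch. It is not the SDK code: only ConversationExecutionStatus, IDLE, MAX_ITERATIONS_REACHED, send_message(), run() and max_iteration_per_run come from this PR; the RUNNING and FINISHED members, the ToyConversation class and its fields are assumptions made up for illustration, and the toy agent never finishes on its own so the limit is always hit.

```python
from enum import Enum


class ConversationExecutionStatus(Enum):
    # Only IDLE and MAX_ITERATIONS_REACHED are named in the PR description;
    # RUNNING and FINISHED are assumed here to round out the sketch.
    IDLE = "idle"
    RUNNING = "running"
    FINISHED = "finished"
    MAX_ITERATIONS_REACHED = "max_iterations_reached"


class ToyConversation:
    """Hypothetical stand-in for a conversation; not the SDK class."""

    def __init__(self, max_iteration_per_run: int) -> None:
        self.max_iteration_per_run = max_iteration_per_run
        self.status = ConversationExecutionStatus.IDLE
        self.messages: list[str] = []
        self.steps_taken = 0

    def send_message(self, text: str) -> None:
        self.messages.append(text)
        # New user input clears the iteration-limit state so run() can proceed.
        if self.status is ConversationExecutionStatus.MAX_ITERATIONS_REACHED:
            self.status = ConversationExecutionStatus.IDLE

    def run(self) -> None:
        # Each call to run() gets a fresh budget of max_iteration_per_run steps.
        self.status = ConversationExecutionStatus.RUNNING
        for _ in range(self.max_iteration_per_run):
            self.steps_taken += 1  # placeholder for one agent step
        # The toy agent never finishes, so the budget is always exhausted.
        self.status = ConversationExecutionStatus.MAX_ITERATIONS_REACHED


conv = ToyConversation(max_iteration_per_run=3)
conv.run()                      # stops at the limit
conv.send_message("continue")   # resets the status to IDLE
conv.run()                      # resumes with a fresh budget
```

The point of the sketch is that hitting the limit pauses the conversation rather than ending it: a new user message is enough to make it runnable again.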
…ify budget warning logic
8deea23 to de58827
Co-authored-by: openhands <openhands@all-hands.dev>
f"{self.max_iteration_per_run} steps. "
f"{self.max_iteration_per_run - iteration} "
"steps remaining. Begin wrapping up and "
"provide your best answer."
To note, I don't believe we've ever done this: "begin wrapping up and provide... etc". The reminder never changed, it was only informative e.g. "you're at iteration m out of n".
I actually sent OpenHands to verify what I said, and here is its response. Oh, it wasn't even applied for tool calling, meaning it wasn't applied for ~SOTA LLMs since ~October 2024 or so.
I checked the legacy V0 CodeAct code.
Short answer: no, the live reminder did not push the agent to finish faster. The actual injected text was just the remaining-turn count plus a finish-format hint.
What I found:
- The helper that formats the reminder is in:
  openhands/utils/prompt.py:136-150
- Its text is:
  ENVIRONMENT REMINDER: You have {N} turns left to complete the task. When finished reply with <finish></finish>.

Historically, that reminder was actually injected from CodeAct here:
- older CodeActAgent._get_messages() had:
  if not self.function_calling_active:
      self.prompt_manager.add_turns_left_reminder(messages, state)

So in the old CodeAct path, it only applied in the non-function-calling mode.
Did it ever say more than just the number?
- In commit e27a5ef75 (“In-context reminder”), add_turns_left_reminder() was temporarily expanded to also prepend a workflow reminder every 10 steps.
- But that extra text was things like:
- follow the workflow
- test thoroughly
- hidden tests exist
- quality/completeness matter more than brevity
So even there, it was not “hurry up” or “finish faster” language. If anything, it nudged toward more thoroughness, not speed.
Important
That extra workflow-reminder version appears to have been in dead code by then:
- the call site that actually injected add_turns_left_reminder() was removed earlier in commit 07f0d1ccb
- after that, the helper still existed, but I did not find a live call to it in CodeAct
Net conclusion
- Live old CodeAct injection: just “you have N turns left … when finished reply with <finish></finish>”
- No live urgency/push-to-finish-faster wording
- One later dead-code variant added extra reminder text, but it pushed for thoroughness, not speed
Thank you for trying this! It's very interesting.
I do wonder about one or two things though. First, benchmarks, like any non-interactive runs, really do stop at max_iterations; they have to, so that the run doesn't go on forever. But interactive runs don't have to stop, they only need to pause, so that the user can say something via a UI and the run will continue. That might mean the user experience is affected if the LLM is told that it had better hurry, instead of running normally.
So idk, I think maybe it's worth asking: do we want this applied only to non-interactive runs? do we know we are in a non-interactive run?
Secondly, I'm a bit curious what in-context reminders tell the LLM in other agentic tools. I seem to recall off-hand that OpenAI, at least, has had a reminder for a long time.
Reminding it of fragments of the system prompt or instructions, like telling it again how the process works, is fine IMHO; telling the state of its runtime, like number of iterations remaining is fine; idk, WDYT, is hurrying it or nudging it to completion fine?
Hi @enyst, I agree that nudging it to completion can be removed.
Co-authored-by: openhands <openhands@all-hands.dev>
Remove nudge to wrap up task from budget warning message. The warning now only informs the agent about remaining steps without suggesting they should complete the task. Co-authored-by: openhands <openhands@all-hands.dev>
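As a rough illustration of an informational-only budget warning, the sketch below fires once at each of the 80% and 95% thresholds mentioned in the PR description and reports only the remaining steps. The function name budget_warning, the WARNING_THRESHOLDS_PERCENT constant, the opening words of the message and the ceiling-based trigger rule are all assumptions; the actual message prefix and trigger logic in the PR are not shown in this thread.

```python
from typing import Optional

# Percentages taken from the PR description; kept as integers so the
# trigger step is computed without floating-point rounding.
WARNING_THRESHOLDS_PERCENT = (80, 95)


def budget_warning(iteration: int, max_iteration_per_run: int) -> Optional[str]:
    """Return an informational message when `iteration` lands on a threshold.

    Hypothetical helper: it states how many steps remain and nothing else,
    with no instruction to wrap up.
    """
    for percent in WARNING_THRESHOLDS_PERCENT:
        # Smallest step count that is at least `percent` of the budget
        # (integer ceiling division).
        trigger = (percent * max_iteration_per_run + 99) // 100
        if iteration == trigger:
            remaining = max_iteration_per_run - iteration
            return (
                f"You have used {iteration} of {max_iteration_per_run} steps. "
                f"{remaining} steps remaining."
            )
    return None


# With a budget of 100 steps, only steps 80 and 95 produce a message.
warnings = [w for i in range(1, 101) if (w := budget_warning(i, 100))]
```

Matching on equality rather than "greater than or equal" means each threshold produces exactly one message per run, instead of repeating on every later step.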
@VascoSch92 I'd love your thoughts on this discussion.
[project]
name = "openhands-agent-server"
-version = "1.14.0"
+version = "1.15.0"
I think the agent did this? I don't think it's the right fix; only release PRs should bump versions. I think updating from the main branch will resolve the issue that apparently needed this.
After the discussion in the relevant issue, I think we don't want to have a reminder for the iteration limit. At least, not now. However, I think it would be interesting (at least for me, for subagents) to have a new error type, i.e., MAX_ITERATIONS_REACHED, because right now this type of error falls under the generic ERROR category, which makes it indistinguishable. Can we just make this change in the PR?
I have already used context budget pressure on agents, but it is difficult to say whether it was effective or not. I think it is effective when you have a very limited number of iterations, but that is not our case: so far, at least, I have never seen a task fail due to an iteration overflow in a benchmark using the OpenHands agent.
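To illustrate why a distinct status matters for subagents, here is a small sketch of a parent deciding what to do with a sub-agent run. Everything in it is hypothetical: the reduced Status enum, the next_action function and the action names are invented for illustration, and only the MAX_ITERATIONS_REACHED and ERROR names come from this discussion.

```python
from enum import Enum


class Status(Enum):
    # Hypothetical, reduced status set for illustration only.
    FINISHED = "finished"
    ERROR = "error"
    MAX_ITERATIONS_REACHED = "max_iterations_reached"


def next_action(subagent_status: Status) -> str:
    """Decide what a parent agent does after a sub-agent run ends."""
    if subagent_status is Status.MAX_ITERATIONS_REACHED:
        # Running out of budget is recoverable: the run can be resumed.
        return "resume"
    if subagent_status is Status.ERROR:
        # A generic error gives the parent nothing to act on.
        return "abort"
    return "collect_result"
```

If the iteration limit were reported as a plain ERROR, the first branch could not exist and the parent would have to treat budget exhaustion like any other failure.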
@VascoSch92 @enyst just to clarify
Is that correct?
#2406
Summary
This change allows conversations to resume after hitting max iterations when a new user message is sent, instead of permanently stopping.
Checklist