Add early-finalize continuation for truncated reasoning rollouts by taivu1998 · Pull Request #475 · rllm-org/rllm

taivu1998 · 2026-04-02T22:53:10Z

Summary

This PR adds opt-in support for early-finalizing truncated long-form generations so we can reserve answer budget inside a single completion window.

Closes #267.

What changed

- add a workflow-level early-finalize helper that:
  - reserves a configurable tail budget from max_tokens
  - runs an initial generation with the reduced budget
  - if that generation stops due to length, optionally appends a synthetic suffix, but only when reasoning was cut off inside an unfinished <think>...</think> block
  - continues generation from token input for the reserved tail budget
- add token-in/token-out continuation support to the rollout stack and timing layer for the Verl path
- wire the new helper into the standard workflow implementations and the FinQA workflow
- add Step.response_mask so synthetic suffix tokens remain in completion_ids for continuity while being excluded from Verl loss
- update the Verl transform to preserve the explicit step-level response mask instead of always assuming an all-ones loss mask
- add config knobs for rllm.early_finalize
- add focused tests for early-finalize behavior and response-mask propagation
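
For readers who want a concrete picture of the helper steps listed above, here is a minimal sketch of the budget split, the continuation, and the response mask. It is an illustration under stated assumptions, not the code in this PR: early_finalize, EarlyFinalizeResult, generate_fn, and every parameter name are hypothetical, and generate_fn stands in for whatever token-in/token-out call the rollout engine exposes. The sketch also assumes the synthetic suffix is charged against the reserved tail budget, which the description does not specify.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Hypothetical signature: (input token ids, max new tokens) -> (new token ids, finish reason).
GenerateFn = Callable[[List[int], int], Tuple[List[int], str]]


@dataclass
class EarlyFinalizeResult:
    completion_ids: List[int]  # model tokens plus any synthetic suffix tokens
    response_mask: List[int]   # 1 = model-generated, 0 = synthetic (excluded from loss)


def early_finalize(
    generate_fn: GenerateFn,
    prompt_ids: List[int],
    max_tokens: int,
    tail_budget: int,
    suffix_ids: List[int],
    inject_suffix: bool,
) -> EarlyFinalizeResult:
    if not 0 < tail_budget < max_tokens:
        raise ValueError("tail_budget must be strictly between 0 and max_tokens")

    # Step 1: initial generation with the reduced budget.
    first_ids, finish_reason = generate_fn(prompt_ids, max_tokens - tail_budget)
    completion = list(first_ids)
    mask = [1] * len(first_ids)
    if finish_reason != "length":
        # The model finished on its own; nothing to finalize.
        return EarlyFinalizeResult(completion, mask)

    # Step 2: optionally append the synthetic suffix. It stays in the
    # completion for continuity but is masked out of the loss.
    remaining = tail_budget
    if inject_suffix:
        completion += suffix_ids
        mask += [0] * len(suffix_ids)
        # Assumption: suffix tokens are charged against the tail budget.
        remaining -= len(suffix_ids)

    # Step 3: continue from token input for what is left of the tail budget.
    if remaining > 0:
        tail_ids, _ = generate_fn(prompt_ids + completion, remaining)
        completion += tail_ids
        mask += [1] * len(tail_ids)
    return EarlyFinalizeResult(completion, mask)
```

A downstream transform can then use response_mask directly as the per-token loss mask rather than assuming all ones, which is the behavior the Verl transform change describes.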

Design notes

- the feature is opt-in and defaults to disabled
- v1 stays tightly scoped to the workflow/Verl path that the issue targets
- synthetic answer forcing is narrow by design: for non-thinking truncated outputs we simply continue from the partial completion without injecting a prefix
- existing prompt-length guards remain unchanged; this only handles the case where a single completion runs out of response budget
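
The opt-in default and the narrow suffix rule from the notes above could look roughly like the sketch below. Everything here is hypothetical: EarlyFinalizeConfig, its field names and default values, has_unfinished_think, and suffix_to_inject are illustrative names only, since the description says only that knobs live under rllm.early_finalize. The sketch also inspects completion text alone, so it would not detect a <think> tag that a chat template places in the prompt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EarlyFinalizeConfig:
    # Hypothetical knobs; the values are placeholders, not the PR's defaults.
    enabled: bool = False  # opt-in: disabled unless explicitly turned on
    tail_budget: int = 256
    suffix: str = "\n</think>\n\nFinal answer: "


def has_unfinished_think(text: str, open_tag: str = "<think>", close_tag: str = "</think>") -> bool:
    """Return True if the last opening tag has no closing tag after it."""
    last_open = text.rfind(open_tag)
    if last_open == -1:
        return False  # non-thinking output: no block to close
    return text.find(close_tag, last_open + len(open_tag)) == -1


def suffix_to_inject(cfg: EarlyFinalizeConfig, partial_text: str, finish_reason: str) -> str:
    """Pick the synthetic suffix, or an empty string when none should be injected."""
    if not cfg.enabled or finish_reason != "length":
        return ""
    # Only force an answer when reasoning was cut off mid-block; otherwise
    # generation simply continues from the partial completion.
    return cfg.suffix if has_unfinished_think(partial_text) else ""
```

Returning an empty string for the non-thinking case keeps the continuation path identical in both branches: the caller always continues from the partial completion and only the presence of masked suffix tokens differs.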

Testing

python -m pytest tests/engine/test_early_finalize.py tests/unified_trainer/test_verl_transform.py tests/rewards/test_math_reward.py -q
ruff check rllm/engine/rollout/rollout_engine.py rllm/engine/rollout/verl_engine.py rllm/workflows/early_finalize.py rllm/workflows/timing_mixin.py rllm/workflows/single_turn_workflow.py rllm/workflows/multi_turn_workflow.py rllm/workflows/cumulative_workflow.py rllm/agents/agent.py rllm/experimental/verl/transform.py rllm/experimental/verl/__init__.py projects/finqa/train_finqa.py tests/engine/test_early_finalize.py tests/unified_trainer/test_verl_transform.py
python -m py_compile rllm/engine/rollout/rollout_engine.py rllm/engine/rollout/verl_engine.py rllm/workflows/early_finalize.py rllm/workflows/timing_mixin.py rllm/workflows/single_turn_workflow.py rllm/workflows/multi_turn_workflow.py rllm/workflows/cumulative_workflow.py rllm/agents/agent.py rllm/experimental/verl/transform.py rllm/experimental/verl/__init__.py projects/finqa/train_finqa.py

kylemontgomery1 · 2026-04-04T00:40:36Z

I lean towards this being implemented at the workflow level (e.g., SingleTurnWorkflowWithEarlyFinalize) instead of a global feature in rLLM. @listar2000 What do you think?

listar2000 · 2026-04-04T02:42:12Z

Will take a look. Thanks, @kylemontgomery1, for pointing this out to me.

taivu1998 · 2026-04-19T16:04:33Z

Hi @kylemontgomery1, @listar2000, could you help review again? Thanks!

taivu1998 added 2 commits April 19, 2026 08:55

Add early-finalize continuation for truncated rollouts

c46d0d0

Refactor early finalize into opt-in workflows

1f5ce35

taivu1998 force-pushed the tdv/issue-267-early-finalize branch from cc26d63 to 1f5ce35 on April 19, 2026 15:56

Fix pre-commit formatting in verl transform

aa568a1

taivu1998 wants to merge 3 commits into rllm-org:main from taivu1998:tdv/issue-267-early-finalize
