Force EOS at max_tokens with importance-weight correction by ClementeP · Pull Request #145 · genlm/genlm-control

ClementeP · 2026-05-22T18:44:09Z

SequenceModel previously truncated sequences without an EOS token when hitting max_tokens, producing samples that did not target the length-conditioned distribution. At the boundary step we now swap the proposal for a point mass on EOS and apply the corresponding importance-sampling (IS) correction via a new TokenSampler.logw_eos method. Particles are therefore properly weighted with respect to the target conditioned on |y| <= max_tokens, and every returned sequence ends with EOS.
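The boundary correction can be illustrated with a minimal, library-free sketch. It does not use the genlm-control API: `sample_step`, the `at_boundary` flag, the `<eos>` marker, and the toy weights are all hypothetical names chosen for illustration. It assumes a sampler that normally proposes from the locally normalized next-token weights. When the proposal is replaced by a point mass on EOS, q(EOS) = 1, so the incremental importance weight reduces to the target's unnormalized weight for EOS.

```python
import math
import random

EOS = "<eos>"  # hypothetical EOS marker for this toy example


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sample_step(logw_next, at_boundary, rng):
    """Return (token, incremental log weight) for one sampling step.

    logw_next maps each token (including EOS) to the target's
    unnormalized next-token log weight.
    """
    if at_boundary:
        # Proposal is a point mass on EOS, so q(EOS) = 1 and the
        # importance weight is the target's unnormalized weight for EOS.
        return EOS, logw_next[EOS]
    # Otherwise propose from the locally normalized weights; the
    # incremental weight is the normalizer, whichever token is drawn.
    tokens = list(logw_next)
    log_z = logsumexp([logw_next[t] for t in tokens])
    probs = [math.exp(logw_next[t] - log_z) for t in tokens]
    token = rng.choices(tokens, weights=probs, k=1)[0]
    return token, log_z


logw_next = {"a": math.log(0.5), "b": math.log(0.3), EOS: math.log(0.1)}
rng = random.Random(0)
print(sample_step(logw_next, at_boundary=False, rng=rng))
print(sample_step(logw_next, at_boundary=True, rng=rng))
```

In this toy, a particle forced to stop at the boundary is down-weighted by exactly the mass the target assigns to stopping there, which is what keeps the weighted particles consistent with the length-conditioned target.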

SetTokenSampler overrides logw_eos to use complete(ctx) - prefix(ctx), avoiding materializing the full logw_next vector that set samplers are designed to skip during regular sampling.
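The identity behind this override can be sketched with a toy potential. This is an assumption-laden illustration, not the library's implementation: `ToyPotential`, the free functions `logw_eos` and `logw_next_full`, and the synchronous method signatures are hypothetical. It assumes that the EOS entry of the next-token weight vector equals `complete(ctx) - prefix(ctx)`, as the description states, and shows that the entry can be computed with two evaluations instead of one per vocabulary item.

```python
import math

NEG_INF = float("-inf")


class ToyPotential:
    """Hypothetical potential defined by weights on complete strings."""

    def __init__(self, weights):
        self.weights = weights  # complete sequence -> positive weight

    def complete(self, ctx):
        # Log weight of ctx as a finished sequence.
        w = self.weights.get(ctx, 0.0)
        return math.log(w) if w > 0 else NEG_INF

    def prefix(self, ctx):
        # Log of the total weight of complete sequences extending ctx.
        total = sum(w for s, w in self.weights.items() if s.startswith(ctx))
        return math.log(total) if total > 0 else NEG_INF


def logw_eos(potential, ctx):
    """EOS log weight from two evaluations, with no pass over the vocabulary."""
    p = potential.prefix(ctx)
    if p == NEG_INF:
        return NEG_INF  # ctx has no support; avoid computing -inf - -inf
    return potential.complete(ctx) - p


def logw_next_full(potential, ctx, vocab):
    """Full next-token vector, for comparison: one prefix call per token.

    Assumes prefix(ctx) is finite.
    """
    p = potential.prefix(ctx)
    logws = {tok: potential.prefix(ctx + tok) - p for tok in vocab}
    logws["<eos>"] = potential.complete(ctx) - p
    return logws


pot = ToyPotential({"ab": 0.5, "abc": 0.25, "b": 0.25})
print(logw_eos(pot, "ab"))
print(logw_next_full(pot, "ab", "abc")["<eos>"])
```

Here "ab" is both a complete sequence and a prefix of "abc", so its EOS weight is neither 0 nor -inf; both code paths compute the same value for it.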

Tests are updated to reflect the new boundary semantics, and a focused prefix-overlap test is added to cover a case the existing property test was not exercising.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codecov · 2026-05-22T18:46:05Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


ClementeP requested review from samuki and vicky-xef May 22, 2026 18:44


ClementeP wants to merge 1 commit into main from clemente-max-tokens-to-eos
