feat(base): per-token is_content mask for body/scaffold attribution #53
Open
snimu wants to merge 4 commits into
Generalizes sampled_mask across all roles. is_content[k] is True iff
token k came from message-body bytes — caller-provided content /
tool_calls / reasoning_content, or the model's sampled emission for
assistant — and False iff template scaffolding (role tags, closers
when not sampled, inter-turn separators, tool-response wraps,
tools-header block, generation prompt). By construction is_content ==
sampled_mask over every assistant-attributed token; carries new
information on every other role where sampled_mask is uniformly False.
Enables SFT on tool response bodies while applying RL only to
assistant tokens — build_training_sample(..., content_sft_roles={"tool"})
trains the model to anticipate tool outputs without learning to emit
the surrounding <|tool_response>/role-tag scaffold (which would
interrupt a real rollout).
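A minimal sketch of the masking rule described above, assuming the combined loss mask is simply "sampled assistant tokens, plus body tokens of the opted-in roles". `combined_loss_mask` and the per-token `token_roles` list are hypothetical names used for illustration; this is not the real `build_training_sample`.

```python
def combined_loss_mask(token_roles, sampled_mask, is_content, content_sft_roles):
    """Hypothetical stand-in for the masking rule; not the real build_training_sample."""
    if not (len(token_roles) == len(sampled_mask) == len(is_content)):
        raise ValueError("token_roles, sampled_mask and is_content must be parallel")
    return [
        # Keep sampled tokens (RL), plus body tokens of opted-in roles (SFT).
        sampled or (body and role in content_sft_roles)
        for role, sampled, body in zip(token_roles, sampled_mask, is_content)
    ]


# A tool message (wrap, body, body, wrap) followed by an assistant turn
# (one scaffold role-tag token, then two sampled tokens).
token_roles = ["tool"] * 4 + ["assistant"] * 3
sampled_mask = [False, False, False, False, False, True, True]
is_content = [False, True, True, False, False, True, True]

rl_only = combined_loss_mask(token_roles, sampled_mask, is_content, set())
rl_plus_tool_sft = combined_loss_mask(token_roles, sampled_mask, is_content, {"tool"})
```

With an empty role set the result equals `sampled_mask`; opting in `"tool"` adds only the two tool body tokens and leaves the tool wrap tokens unsupervised.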
New on RenderedTokens:
- is_content: list[bool] field (empty when the renderer opts out, same
policy as sampled_mask)
- content_token_spans_by_role()
- content_mask_for_roles(roles)
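The two helpers can be sketched on a stand-in dataclass. This is an illustration under stated assumptions, not the real `RenderedTokens`: `MiniRendered` is a hypothetical name, `message_indices[k]` is assumed to hold the message index of token `k` (negative when the token belongs to no message), and spans are assumed to be half-open `(start, end)` pairs.

```python
from dataclasses import dataclass, field


@dataclass
class MiniRendered:
    """Hypothetical stand-in for RenderedTokens; field semantics are assumed."""
    token_ids: list
    message_indices: list
    message_roles: list
    is_content: list = field(default_factory=list)

    def _check_lengths(self):
        # Both parallel lists must match token_ids before any indexing.
        n = len(self.token_ids)
        if len(self.is_content) != n or len(self.message_indices) != n:
            raise ValueError("is_content and message_indices must match token_ids")

    def content_mask_for_roles(self, roles):
        if not self.is_content:  # renderer opted out
            return []
        self._check_lengths()
        return [
            body and idx >= 0 and self.message_roles[idx] in roles
            for body, idx in zip(self.is_content, self.message_indices)
        ]

    def content_token_spans_by_role(self):
        spans = {}
        if not self.is_content:
            return spans
        self._check_lengths()
        n, start = len(self.token_ids), None
        for k in range(n + 1):  # one extra step closes a run ending at n
            inside = k < n and self.is_content[k] and self.message_indices[k] >= 0
            same_run = (
                inside
                and start is not None
                and self.message_indices[k] == self.message_indices[start]
            )
            if start is not None and not same_run:
                role = self.message_roles[self.message_indices[start]]
                spans.setdefault(role, []).append((start, k))
                start = None
            if inside and start is None:
                start = k
        return spans


rendered = MiniRendered(
    token_ids=list(range(10)),
    message_indices=[0, 0, 0, 1, 1, 1, 1, 2, 2, 2],
    message_roles=["user", "tool", "assistant"],
    is_content=[False, True, False, False, True, True, False, False, True, True],
)
tool_mask = rendered.content_mask_for_roles({"tool"})
spans = rendered.content_token_spans_by_role()
```

Here `tool_mask` is true only on tokens 4 and 5, and `spans` maps each role to its body runs, for example `(4, 6)` for the tool message.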
New module-level helpers in base.py:
- attribute_text_segments(tokenizer, segments) — single-BPE-pass
attribution via offset_mapping; auto-loads a vanilla offset-capable
tokenizer when the supplied one doesn't track offsets (fastokens
patch), cached process-globally per model name.
- build_training_sample(..., content_sft_roles=...) — opt-in body-only
supervision for roles the model never samples. Falls back to the
prior role_to_mask + sampled_mask behaviour when is_content is
empty.
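The offset-based attribution idea can be illustrated without a real tokenizer. The sketch below is not `attribute_text_segments`: `toy_encode_with_offsets` is a hypothetical regex tokenizer standing in for one that reports `offset_mapping`, and the rule for a token that straddles a segment boundary (largest character overlap wins, earlier segment on ties) is an assumption made for the example.

```python
import re


def toy_encode_with_offsets(text):
    """Toy stand-in for a tokenizer that reports per-token character offsets."""
    return [(m.group(), m.span()) for m in re.finditer(r"\w+|\s+|[^\w\s]+", text)]


def attribute_segments(segments):
    """Tokenize the joined text once, then label each token via its offsets."""
    text = "".join(seg for seg, _ in segments)
    # Character range covered by each segment inside the joined text.
    bounds, pos = [], 0
    for seg, flag in segments:
        bounds.append((pos, pos + len(seg), flag))
        pos += len(seg)
    tokens, mask = [], []
    for tok, (start, end) in toy_encode_with_offsets(text):
        # Assumed rule: the segment with the largest character overlap wins.
        best = max(bounds, key=lambda b: min(end, b[1]) - max(start, b[0]))
        tokens.append(tok)
        mask.append(best[2])
    return tokens, mask


tokens, mask = attribute_segments(
    [("user\n", False), ("hello world", True), ("<end>", False)]
)
```

Because the wrap and body are joined before tokenizing, any merge across the boundary happens exactly as it would in a single emit; the mask is derived afterwards from character spans alone.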
Wired through every hand-coded renderer: qwen3, qwen3.5, qwen3.6
(inherits), qwen3-vl, glm5, glm5.1, glm4.5, kimi-k2, kimi-k2.5/2.6,
minimax-m2, deepseek-v3, nemotron-3, laguna-xs.2, gpt-oss. Concatenated
wrap+body emits go through emit_text_segments (or per-renderer
equivalents) so BPE merges at the boundary stay byte-identical with the
prior single-emit path. Renderers whose tokenizer doesn't support
offset_mapping (Kimi, MiniMax with its known fastokens edge case) use
boundary-aware emit patterns or a per-renderer overlap rule to keep
body bytes recoverable.
Multimodal placeholders (<|image_pad|>, <|media_pad|>) are body
(is_content=True) — they represent caller-provided image data in
token form. The surrounding vision/media wrap specials are scaffold.
Fixed along the way:
- nemotron3: off-by-one in message_roles when a default system was
auto-injected. Now indexes the caller-provided message list.
- kimi_k2: same off-by-one fixed via a caller_messages snapshot.
Tests: 10 invariants × 17-model matrix in tests/test_is_content.py.
Token IDs stay byte-identical vs apply_chat_template across every
renderer; existing test_render_ids / test_sampled_mask / test_bridge /
test_build_helpers / test_tokens_per_message / test_roundtrip /
test_multimodal stay green. DefaultRenderer leaves is_content empty
(Jinja is opaque), same policy as sampled_mask.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Output of `uv run ruff format` after wiring is_content through the renderers in the previous commit. No semantic changes.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Approvability verdict: Needs human review. New feature adding per-token body/scaffold attribution.
…y_role
content_token_spans_by_role only checked that is_content matched token_ids length, but it then walks spans returned by message_token_spans(), whose values are indices into message_indices. When len(message_indices) > len(token_ids), the span end can exceed len(is_content), and the inner loop dereferences self.is_content[k] past the end, raising IndexError.
The sister method content_mask_for_roles already checks both lengths (introduced in the same commit). Match the precedent.
Not reachable through the renderer pipeline today (every hand-coded renderer populates the four parallel lists in lock-step), but the dataclass doesn't enforce the invariant, so a manually-constructed RenderedTokens (e.g. a test fixture) could trip it. The guard is cheap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
generate() already calls renderer.render() internally when the caller
doesn't pre-supply prompt_ids, producing a RenderedTokens that carries
token_ids, message_indices, sampled_mask, is_content, message_roles,
and multi_modal_data. Previously we surfaced only token_ids and
multi_modal_data and dropped the rest at the function boundary.
Callers that wanted per-token attribution downstream (verifiers'
RendererClient → prime-rl, for SFT-on-tool-body / selective loss
masking) had no way to recover it without a second render pass.
Two surfaces:
- New return field ``prompt_attribution``: the full RenderedTokens for
the prompt — either the one this call computed via render() or the
one the caller threaded in alongside prompt_ids. Downstream consumers
call e.g. ``content_mask_for_roles({"tool"})`` on it to build the
SFT-on-tool-body mask without re-rendering.
- New optional parameter ``prompt_attribution``: callers that pre-built
prompt_ids (the multi-turn bridge path in verifiers) can hand in the
RenderedTokens that bridge_to_next_turn returned, and it surfaces on
the result unchanged.
Mirrors the existing multi_modal_data plumbing — same shape, same
None-default-when-unknown semantics. No behavioural change for callers
that don't read the new field.
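The pass-through semantics can be sketched as follows. Everything here is hypothetical scaffolding for illustration: `Rendered` stands in for `RenderedTokens`, `toy_render` for a renderer, and `generate_stub` for the part of `generate()` that handles the prompt (the real function also samples a completion, and its actual signature is not shown in this PR text).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Rendered:
    """Stand-in for RenderedTokens with only the fields used here."""
    token_ids: list
    is_content: list = field(default_factory=list)


def toy_render(messages):
    """Toy renderer: one scaffold token per message, then one body token per word."""
    ids, body = [], []
    for msg in messages:
        ids.append(0)
        body.append(False)
        for word in msg["content"].split():
            ids.append(1 + len(word))
            body.append(True)
    return Rendered(token_ids=ids, is_content=body)


def generate_stub(messages=None, prompt_ids=None,
                  prompt_attribution: Optional[Rendered] = None):
    """Hypothetical shape of the plumbing around the prompt."""
    if prompt_ids is None:
        # Rendered internally: surface the full attribution, not just the ids.
        rendered = toy_render(messages)
        prompt_ids, prompt_attribution = rendered.token_ids, rendered
    # Pre-built prompt: whatever the caller threaded in (possibly None) passes through.
    return {"prompt_ids": prompt_ids, "prompt_attribution": prompt_attribution}


rendered_path = generate_stub(messages=[{"role": "user", "content": "hi there"}])
prebuilt_without = generate_stub(prompt_ids=[0, 3])
bridge_attr = Rendered(token_ids=[0, 3], is_content=[False, True])
prebuilt_with = generate_stub(prompt_ids=[0, 3], prompt_attribution=bridge_attr)
```

The three calls correspond to the three tested cases: internal render, pre-built prompt without attribution (`None`), and pre-built prompt with attribution handed in.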
Tests:
- Existing parse-and-build test now asserts prompt_attribution carries
every populated RenderedTokens field through verbatim.
- New test for the pre-built-prompt-with-attribution path (the bridge
shape verifiers will use).
- New assertion that pre-built-prompt-without-attribution returns
``None`` so callers can detect the gap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Author: @macroscope-app please run a correctness check again
Correctness check complete. No issues found.
Summary
- Adds `is_content: list[bool]` to `RenderedTokens`: a per-token signal that generalises `sampled_mask` across all roles. It is `True` iff the token came from message-body bytes (caller-provided `content` / `tool_calls` / `reasoning_content`, or the model's sampled emission for assistant) and `False` iff it is template scaffolding (role tags, closers when not sampled, separators, tool-response wraps, tools-header block, generation prompt).
- `build_training_sample(..., content_sft_roles={"tool"})` lets a single render produce a loss mask that combines RL on assistant tokens with SFT on tool response bodies, without supervising the surrounding `<|tool_response>` / role-tag specials that would interrupt a real rollout.
- Token IDs stay byte-identical vs `apply_chat_template`.
- `renderers.client.generate()` returns the renderer's per-token attribution as `prompt_attribution: RenderedTokens`, so downstream consumers (verifiers `RendererClient` → prime-rl) carry the body/scaffold cut to the trainer without re-rendering.

Motivation
For RL the policy loss applies only to tokens the model emitted. A useful auxiliary objective is SFT on tool response bodies: supervise the model to anticipate what tools return, without supervising the wrap. If the model learns to emit `<|tool_response>` itself, it can derail a rollout by short-circuiting the harness.

`sampled_mask` answers "would the model emit this?", which is the right cut for assistant tokens but is uniformly `False` on non-assistant roles. There is no way to ask "which tokens came from message-body bytes" on tool / user / system messages using `sampled_mask` alone.

`is_content` is that signal. For a tool message wrapped as `<|im_start|>user\n<|tool_response>\n{body}\n<|tool_response_end|><|im_end|>\n`, `is_content` is `True` only on the `{body}` tokens, never on the `<|tool_response>` specials or the inter-section newlines.

By construction `is_content == sampled_mask` over every assistant-attributed token; on every other role `sampled_mask` is uniformly `False` and `is_content` carries information `sampled_mask` cannot. Every token with `sampled_mask=True` therefore also has `is_content=True`: the set of `is_content` tokens is a superset of the sampled set and never contradicts it.

API
On `RenderedTokens`:
- `is_content: list[bool]`: same length / empty policy as `sampled_mask`. Empty means the renderer opts out (DefaultRenderer leaves it empty for the same reason it leaves `sampled_mask` empty: Jinja is opaque).
- `content_token_spans_by_role() -> dict[str, list[tuple[int, int]]]`: contiguous body-only token runs grouped by message role.
- `content_mask_for_roles(roles) -> list[bool]`: per-token bool mask, `True` only on body tokens whose message role is in the supplied set.

Module-level in `renderers.base`:
- `attribute_text_segments(tokenizer, segments)`: single-BPE-pass attribution via `offset_mapping`. When the supplied tokenizer doesn't track offsets (fastokens patch), lazy-loads a vanilla offset-capable tokenizer for the same model and caches it process-globally.
- `build_training_sample(..., content_sft_roles=...)`: opt-in body-only supervision for roles the model never samples. Falls back to the `role_to_mask + sampled_mask` behaviour when `is_content` is empty.

On `renderers.client.generate()`:
- New return field `prompt_attribution: RenderedTokens | None`: the per-token attribution for the prompt, either the one this call computes via `render()` internally or the one the caller threaded in alongside `prompt_ids`. Downstream consumers call `attr.content_mask_for_roles({"tool"})` on it to build selective loss masks without re-rendering.
- New optional parameter `prompt_attribution: RenderedTokens | None = None`: callers that pre-build `prompt_ids` (the multi-turn bridge path in verifiers) hand in the `RenderedTokens` that `bridge_to_next_turn` returned, and it surfaces on the result unchanged.

The new field on `generate()` mirrors the existing `multi_modal_data` sidecar: same shape, same None-default-when-unknown semantics.

How it works
Every renderer has emit sites like `emit_text("user\n" + content, ...)` that join wrap text and body text into one BPE pass to preserve token merges at the boundary. The `emit_text_segments(...)` helper (defined locally in each renderer) does the same join with per-token attribution:
- Tokenize the joined text in a single BPE pass with `offset_mapping` to recover each token's character span.
- Attribute each token to the correct segment.

`fastokens` (the Rust BPE patched in by default for ~10x faster encode) doesn't track offsets. `attribute_text_segments` transparently loads a vanilla offset-capable tokenizer for the same model and caches it process-globally per model name. Most models in `MODEL_RENDERER_MAP` produce byte-identical token IDs between fastokens and vanilla, so the mix is safe; models in `FASTOKENS_INCOMPATIBLE` already use vanilla everywhere.

A few renderers use tokenizers that can't provide offset mapping at all and rely on per-renderer alternatives:
- Kimi: `TikTokenTokenizer`. Avoids concatenated wrap+body emits to begin with. Kimi's structure splits wrap and body at special-token boundaries, so threading `is_content` through the split emits suffices.
- MiniMax: a BPE merge can occur between `<response>` and the body's first letter under certain tokenizer load orders. A local `emit_token_overlap_body` helper picks the overlap rule so the body's leading byte stays recoverable from its body run.

Per-renderer coverage
- qwen3
- qwen3.5: `enable_thinking` polarity preserved.
- qwen3.6: inherits `Qwen35Renderer`; only overrides a pure string serializer, so it picks up `is_content` through the parent class.
- qwen3-vl: `<|image_pad|>` placeholders are body (`is_content=True`); the surrounding `<|vision_start|>` / `<|vision_end|>` are scaffold.
- glm5 / glm5.1: `GLM5Renderer` covers both via subclass. Also covers `zai-org/GLM-4.7-Flash`.
- glm4.5: `<|observation|>` / `<tool_response>` wraps are scaffold; body is content.
- kimi-k2: `TikTokenTokenizer`; uses existing split-emit boundaries (no `attribute_text_segments`).
- kimi-k2.5 / kimi-k2.6: `<|media_pad|>` is body; `<|media_begin|>...<|media_end|>` wrap is scaffold.
- minimax-m2: `FASTOKENS_INCOMPATIBLE` (vanilla everywhere). Local overlap helper for `<response>` BPE merge.
- deepseek-v3: `FASTOKENS_INCOMPATIBLE` (Metaspace pretokenizer). Standard wrap/body split.
- nemotron-3: … `emit_text_segments` for `\n` boundaries.
- laguna-xs.2
- gpt-oss: … `functions.{name}` text on tool result messages is scaffold (comes from prior assistant `tool_calls`, not this tool's content).

`DefaultRenderer` leaves `is_content` empty.

Tests
`tests/test_is_content.py`: 10 invariants × 17-model matrix. The invariants include:
- `is_content` has the same length as `token_ids` or is empty (opt-out).
- `is_content == sampled_mask` over assistant tokens.
- … `is_content=False`.
- … `is_content=True` run.
- … `is_content=False`.
- `content_token_spans_by_role()` isolates tool body cleanly.
- `content_mask_for_roles({"tool"})` excludes assistant.
- `build_training_sample(..., content_sft_roles={"tool"})` trains tool body + assistant, never user.

`tests/test_client.py` covers the `prompt_attribution` surface on `generate()`:
- The existing parse-and-build test asserts that `prompt_attribution` carries every populated `RenderedTokens` field through verbatim.
- A pre-built prompt (`prompt_ids` and `prompt_attribution`) passes attribution through unchanged.
- A pre-built prompt without attribution returns `None` so callers can detect the gap.

Full suite collects 1557 tests; all pass (modulo pre-existing gpt-oss HF-parity skips and one unrelated xfailed). `test_render_ids` byte-identity vs `apply_chat_template` is green on every renderer.

Additional fixes
- `nemotron3`: `message_roles` was sourced from the auto-injected normalised list, off-by-one when a default system was prepended. Now indexes the caller-provided message list.
- `kimi_k2`: same off-by-one fixed via a `caller_messages` snapshot.

Notes for the maintainer
`bridge_to_next_turn` populates `is_content` on the bridge-emitted portion only; the prior portion (`previous_prompt_ids + previous_completion_ids`) gets `[False] * len(previous_ids)` per the same convention `sampled_mask` follows on bridge output. Consumers walk the trajectory and read each step's own `is_content` for full-conversation body masks.

Note
Add per-token `is_content` body/scaffold attribution mask to all renderers
- Adds `is_content: list[bool]` field to `RenderedTokens` in renderers/base.py that marks each token as caller/model body (True) or template scaffolding (False).
- Adds `attribute_text_segments` in renderers/base.py to tokenize concatenated `(text, is_content)` segments in a single BPE pass using offset mapping, preserving merge boundaries while attributing each token to the correct segment.
- Adds `is_content` population across all renderers (`qwen3`, `qwen35`, `qwen3_vl`, `deepseek_v3`, `gpt_oss`, `kimi_k2`, `kimi_k25`, `laguna_xs2`, `minimax_m2`, `nemotron3`, `glm45`, `glm5`), including `render`, `bridge` / `bridge_to_next_turn`, and assistant/tool helpers.
- Extends `build_training_sample` with a `content_sft_roles` parameter that restricts loss to body-only tokens for specified roles using `is_content`, leaving behavior unchanged when the field is absent or empty.
- Adds `content_token_spans_by_role` and `content_mask_for_roles` helpers to `RenderedTokens` for downstream span extraction.
- … `is_content == sampled_mask`; `message_roles` in some renderers now reflects the original caller message list rather than the post-normalized list.

Macroscope summarized 281d89b.