feat: renderer self-describes prefix-stability for SFT / RL / inference consumers #41

## Motivation

A renderer's output for `render(messages[:k])` vs `render(messages[:k+1])` can differ in non-obvious ways depending on the template's history-handling semantics. Today this knowledge is implicit — each renderer's `emit_*` helpers encode it, but consumers (SFT dataloaders, RL trajectory builders, inference incremental rendering) have to either hard-code per-template assumptions or assume the worst case.

Two concrete consumer problems this causes:

1. **SFT loses intermediate reasoning on non-prefix-stable templates.** `build_training_sample` does a single full-conversation render. For Qwen3/GLM5 defaults, that single render strips reasoning from every assistant turn except the last (history-stripping), so the model is trained to produce reasoning only on the final turn. Workarounds today: set `preserve_all_thinking=True` (which also changes inference semantics) or use `build_incremental_token_mask` (which raises `IncrementalTokenizationError` on non-prefix-stable templates). Neither trains intermediate reasoning under the default template semantics.
2. **Inference bridges can't short-circuit.** `RendererClient._get_incremental_prompt_ids` always calls `bridge_to_next_turn`, even when the renderer is fully prefix-stable and a naive concat would be correct. The bridge does real work it doesn't need to.

The renderer is the source of truth for these semantics — it has to be, because it implements the emit logic. Today it doesn't expose that truth.

## Proposed API

A declarative property describing which append-boundaries preserve the rendered prefix:

```python
# renderers/base.py
from dataclasses import dataclass
from typing import Literal

Boundary = Literal["tool", "user", "system", "developer"]

@dataclass(frozen=True)
class RenderStability:
    """Which append-boundaries preserve the rendered prefix.

    ``boundary in preserves_through`` means: for any messages M and a single
    appended message m where ``m["role"] == boundary``, ``render(M).token_ids``
    is a prefix of ``render(M + [m]).token_ids``. The bridge for that boundary
    is then a trivial "append new tokens" — no history transformation.

    Boundaries *not* in the set may still be appendable (via ``bridge_to_next_turn``),
    but the renderer transforms earlier tokens (e.g. strips reasoning from a prior
    assistant when a new user turn arrives).
    """
    preserves_through: frozenset[Boundary]

    @property
    def fully_stable(self) -> bool:
        return self.preserves_through >= {"tool", "user", "system", "developer"}


class Renderer:
    @property
    def stability(self) -> RenderStability: ...
```

## Per-renderer values

| Renderer | `preserves_through` | Notes |
|---|---|---|
| Qwen3 default | `{"tool"}` | Stable within a tool cycle; user boundary strips assistant reasoning. |
| Qwen3 + `preserve_thinking_between_tool_calls=True` | `{"tool"}` | Same as default: the flag only preserves reasoning *within* the current tool cycle, which the emit logic already does by default. |
| Qwen3 + `preserve_all_thinking=True` | `{"tool", "user", "system", "developer"}` | Reasoning preserved across every boundary. |
| GLM5 / variants | analogous to Qwen3 | |
| Kimi K2, K2.5 | TBD by template emit logic | |
| DeepSeek V3, Nemotron 3, Laguna XS.2 | TBD | |
| GPT-OSS | TBD (Harmony format) | |
| `DefaultRenderer` (Jinja) | `frozenset()` | Opaque template, assume nothing. |

Stability depends on construction-time flags (`preserve_all_thinking`, `preserve_thinking_between_tool_calls`): the same renderer class can advertise different `stability` values depending on its init args. The cache key for shared pools already includes these flags (`verifiers/clients/renderer_client.py:486-494`).
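A minimal sketch of how a flag-aware renderer might derive `stability` from its init args, following the table above. `ToyQwen3Renderer` is an illustrative stand-in, not the real Qwen3 renderer, and the values assigned to `FULLY_STABLE` and `STABLE_IN_TOOL_CYCLE` are assumptions (only the constant names appear in this proposal).

```python
from dataclasses import dataclass
from typing import Literal

Boundary = Literal["tool", "user", "system", "developer"]


@dataclass(frozen=True)
class RenderStability:
    preserves_through: frozenset[Boundary]

    @property
    def fully_stable(self) -> bool:
        return self.preserves_through >= {"tool", "user", "system", "developer"}


# Assumed values for the helper constants named in the implementation sketch.
FULLY_STABLE = RenderStability(frozenset({"tool", "user", "system", "developer"}))
STABLE_IN_TOOL_CYCLE = RenderStability(frozenset({"tool"}))


class ToyQwen3Renderer:
    """Illustrative stand-in: only the flag -> stability mapping is modeled."""

    def __init__(
        self,
        preserve_all_thinking: bool = False,
        preserve_thinking_between_tool_calls: bool = False,
    ) -> None:
        self._preserve_all_thinking = preserve_all_thinking
        self._preserve_thinking_between_tool_calls = preserve_thinking_between_tool_calls

    @property
    def stability(self) -> RenderStability:
        # Per the table, only preserve_all_thinking widens the set; the
        # tool-call flag leaves the declared boundaries unchanged.
        if self._preserve_all_thinking:
            return FULLY_STABLE
        return STABLE_IN_TOOL_CYCLE
```

Because `RenderStability` is frozen and hashable, the shared constants can be returned directly without copying.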

## Consumer impact

### SFT (in `prime-rl`)

`build_training_sample` queries `renderer.stability`:

- `fully_stable` → current single-render path is correct.
- `"tool" in preserves_through` but not `"user"` (Qwen3/GLM5 default) → split conversation at user boundaries; for each assistant-terminated segment, produce a separate training sample. Each captures that turn's reasoning under inference-correct history-stripping. Dataloader fans out 1 conversation → N samples.
- empty (`DefaultRenderer`) → fall back to per-assistant render. Slowest, always correct.

The fan-out changes batch-sizing semantics: "batch of 32 conversations" becomes "batch of `sum(segments_per_conversation)` samples." It is worth deciding upfront whether to expand at the dataloader level or to pack the N segments into one sample with attention isolation.
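The user-boundary split for the Qwen3/GLM5 default case could look roughly like the sketch below. It is one interpretation of "one sample per assistant-terminated segment": each sample is the conversation prefix ending at the last assistant message before the next user turn. The function name `fan_out_at_user_boundaries` is hypothetical, and the sketch assumes the loss mask for each sample would cover only the final segment's assistant tokens (masking is not shown).

```python
from typing import Any

Message = dict[str, Any]


def fan_out_at_user_boundaries(messages: list[Message]) -> list[list[Message]]:
    """Return one message prefix per assistant-terminated segment.

    A prefix ends at an assistant message that is either the last message
    or is immediately followed by a user message.
    """
    prefixes: list[list[Message]] = []
    for i, msg in enumerate(messages):
        is_last = i == len(messages) - 1
        next_is_user = not is_last and messages[i + 1]["role"] == "user"
        if msg["role"] == "assistant" and (is_last or next_is_user):
            prefixes.append(messages[: i + 1])
    return prefixes


conversation = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "What is the weather?"},
    {"role": "assistant", "content": "", "tool_calls": ["get_weather"]},
    {"role": "tool", "content": "sunny"},
    {"role": "assistant", "content": "It is sunny."},
    {"role": "user", "content": "Thanks, and tomorrow?"},
    {"role": "assistant", "content": "Rain is expected."},
]

samples = fan_out_at_user_boundaries(conversation)
print([len(p) for p in samples])  # [5, 7]
```

Here one conversation fans out into two samples, so the effective batch size is the sum of these per-conversation counts.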

### RL (verifiers / prime-rl orchestrator)

`bridge_to_next_turn` stays the canonical correctness path. When `renderer.stability.fully_stable`, the caller can skip the bridge and do a pure concat — an optimization, not a correctness change. Existing code keeps working.

### Inference

Same — `RendererClient._get_incremental_prompt_ids` can branch on `stability` and skip the bridge dispatch for fully-stable renderers.
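A rough sketch of the short-circuit, applicable to both the RL and inference callers. Everything here is a toy: `ToyRenderer`, its `render_new_messages` helper, and the `bridge_to_next_turn(prev_ids, new_messages)` signature are assumptions for illustration and do not reflect the real `RendererClient` or bridge interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderStability:
    preserves_through: frozenset[str]

    @property
    def fully_stable(self) -> bool:
        return self.preserves_through >= {"tool", "user", "system", "developer"}


@dataclass
class ToyRenderer:
    """Hypothetical renderer emitting one token id per message."""

    stability: RenderStability
    bridge_calls: int = 0

    def render_new_messages(self, new_messages: list[dict]) -> list[int]:
        # Hypothetical helper: token ids for the appended messages alone.
        return [len(m["content"]) for m in new_messages]

    def bridge_to_next_turn(self, prev_ids: list[int], new_messages: list[dict]) -> list[int]:
        # Hypothetical signature; a real bridge may rewrite earlier tokens.
        self.bridge_calls += 1
        return prev_ids + self.render_new_messages(new_messages)


def next_prompt_ids(renderer: ToyRenderer, prev_ids: list[int], new_messages: list[dict]) -> list[int]:
    """Skip the bridge when the renderer declares itself fully stable."""
    if renderer.stability.fully_stable:
        # Pure concat: earlier tokens are guaranteed to be untouched.
        return prev_ids + renderer.render_new_messages(new_messages)
    # Canonical correctness path for everything else.
    return renderer.bridge_to_next_turn(prev_ids, new_messages)


stable = ToyRenderer(RenderStability(frozenset({"tool", "user", "system", "developer"})))
unstable = ToyRenderer(RenderStability(frozenset({"tool"})))
new = [{"role": "tool", "content": "abc"}]

stable_ids = next_prompt_ids(stable, [1, 2], new)
unstable_ids = next_prompt_ids(unstable, [1, 2], new)
```

The fast path is only an optimization: dropping the `fully_stable` branch leaves the result unchanged for a correctly declared renderer.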

## Open questions

1. **Include `"developer"`?** GPT-OSS already handles a `developer` role (`gpt_oss.py:442`) for Harmony / Responses-API messages. Including it keeps GPT-OSS first-class without consumer-side special-casing. Templates that don't use developer messages just never see them — the flag is moot for them.
2. **Per-boundary bridge declarations?** Could extend to `RenderStability.bridge_required: frozenset[Boundary]` so consumers know *which* bridges do non-trivial work vs. trivial-append. Today every renderer has one `bridge_to_next_turn`; this would be a separate signal.
3. **Tools change as a boundary?** Today `tools` is passed alongside `messages`. If the tool list changes between renders (e.g. tool added mid-trajectory), the system-section content changes and prefix-stability breaks regardless of message role. Should the API model that too? (Probably yes, as a separate `stable_under_tools_change: bool`.)

## Implementation sketch

- Renderer side: add a `stability` property to each renderer class. Most are one-liners returning a fixed `RenderStability`. Flag-aware ones (Qwen3, GLM5) compute it from their `_preserve_*_thinking` attributes.
- New module: `renderers.stability` exporting `Boundary`, `RenderStability`, helper constants for common cases (e.g. `FULLY_STABLE`, `STABLE_IN_TOOL_CYCLE`).
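- Tests: parametrize the existing fixtures from `tests/test_sampled_mask.py` to also assert that `render(M + [m])` extends `render(M)` whenever `m["role"] in stability.preserves_through`. The converse (an undeclared boundary breaks the prefix) can only be asserted on fixtures that actually trigger the transformation, e.g. a prior assistant turn with reasoning followed by a user message. This catches drift between declared and actual behavior.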
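A minimal sketch of the drift check from the test bullet above, using a toy history-stripping renderer in place of the real fixtures. `toy_render` and `check_declared_stability` are hypothetical names. The check covers only the direction that must hold for any input: a declared-stable boundary never rewrites the prefix.

```python
def is_prefix(prefix: list[str], full: list[str]) -> bool:
    return full[: len(prefix)] == prefix


def toy_render(messages: list[dict]) -> list[str]:
    """Toy renderer: reasoning is kept only after the last user turn."""
    last_user = max(
        (i for i, m in enumerate(messages) if m["role"] == "user"), default=-1
    )
    tokens: list[str] = []
    for i, m in enumerate(messages):
        tokens.append(f"<{m['role']}>")
        if m["role"] == "assistant" and i > last_user and m.get("reasoning"):
            tokens.append(f"think:{m['reasoning']}")
        tokens.append(m["content"])
    return tokens


def check_declared_stability(render, preserves_through, messages, appended) -> None:
    """Raise AssertionError if a declared-stable boundary rewrites the prefix."""
    if appended["role"] not in preserves_through:
        return  # Nothing is promised for undeclared boundaries.
    before = render(messages)
    after = render(messages + [appended])
    assert is_prefix(before, after), f"{appended['role']!r} boundary rewrote the prefix"


history = [
    {"role": "user", "content": "hi"},
    {"role": "assistant", "reasoning": "plan", "content": "call tool"},
]
tool_msg = {"role": "tool", "content": "result"}
user_msg = {"role": "user", "content": "next question"}

# Correct declaration: passes for both appended messages.
check_declared_stability(toy_render, frozenset({"tool"}), history, tool_msg)
check_declared_stability(toy_render, frozenset({"tool"}), history, user_msg)
```

Declaring `"user"` as stable for this toy renderer would make the check fail, because the new user turn strips the earlier `think:plan` token.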

## Related

- PrimeIntellect-ai/renderers#33 — `sampled_mask` AND in `build_training_sample` (closes the narrower "trains on `<|im_start|>assistant\n` scaffolding" issue)
- PrimeIntellect-ai/renderers#38 — per-message / per-role analytics; also lands `message_indices`/`sampled_mask`/`message_roles` on bridge output
- PrimeIntellect-ai/prime-rl#2436 — exposes `preserve_*_thinking` flags through prime-rl's RendererConfig
- PrimeIntellect-ai/prime-rl#2492 — narrower SFT scaffolding bug, fixed by #33
