Consume the typed RendererConfig surface#2635
Conversation
Reshapes the orchestrator / SFT renderer plumbing around ``renderers.RendererConfig`` (the typed discriminated union) instead of flat strings + booleans. - ``configs.shared.RendererConfig`` now composes ``renderers.RendererConfig`` as a single ``settings`` field; ``pool_size`` stays as a separate prime-rl-side concern. The flat ``name`` / ``tool_parser`` / ``reasoning_parser`` / ``preserve_*`` fields are gone. - ``setup_inference_pool`` / ``setup_clients`` / ``StaticInferencePool`` / ``ElasticInferencePool`` collapse their six-arg renderer surface to a single ``renderer_config: RendererConfig | None`` plus the orthogonal ``renderer_model_name`` / ``renderer_pool_size`` fields. - Orchestrator + SFT trainer call sites pass the typed config straight through to ``create_renderer(tokenizer, config.renderer.settings)``. - Cross-env / unmapped-model validators rewritten to inspect ``renderer.settings`` (discriminated-union shape) instead of the flat field names. - Tests updated to construct typed configs (``Qwen3VLRendererConfig()``, etc.) and assert against the new call shapes. Pins ``deps/renderers`` and the ``prime-rl-configs`` ``renderers`` dep to the renderers PR's tip (``4c9099debbc9bc4fcc3eb6a118bbfa5c364ec395``) until it merges and a dev release ships. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The typed-config refactor landed on renderers main as ``c86c50b`` (tag ``renderers-v0.1.8.dev27``). Switching: - ``deps/renderers`` submodule pointer → ``c86c50b`` - ``prime-rl-configs`` git URL pin → ``@renderers-v0.1.8.dev27`` Same commit, more readable, and stable to anyone reading the diff after the feature branch is deleted. PyPI dev release for the same version is pending a publisher-config fix; once it lands the git URL pin can collapse to a plain ``renderers>=0.1.8.dev27`` PyPI pin.
The new RendererConfig accepts only `settings` (typed `renderers.RendererConfig` discriminated union) and `pool_size`, so flat `name`/`reasoning_parser`/`preserve_*` keys under `[orchestrator.renderer]` now fail pydantic validation. Move those keys under `[orchestrator.renderer.settings]`. `pool_size` stays in the outer block. Two configs that only set `preserve_all_thinking` gain an explicit `name = "auto"` since the discriminator is required. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per Mika's review: drop prime-rl's RendererConfig wrapper that nested the renderers' discriminated union under a ``.settings`` field. The wrapper bought a parallel ``pool_size`` attribute but cost a clunky two-level TOML layout and made every call site dereference ``.settings``. Use ``renderers.RendererConfig`` (the typed discriminated union) as the field type directly. Pool size moves up one level to ``orchestrator.renderer_pool_size``. ``use_renderer: bool`` is gone — ``renderer: RendererConfig | None`` with ``None`` for MITO replaces it (consistent with how SFT, optimizer, and other optional sub-configs express absence). TOML callers spell MITO with ``renderer = "None"``, which ``prime_pydantic_config.BaseConfig._none_str_to_none`` coerces back to Python None. Net effect on TOMLs is a wash with the previous commit: 26 in-repo configs lose the ``.settings`` indirection they gained yesterday, and one (multimodal) splits ``pool_size`` out of the renderer block. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The slim ``prime-rl-configs`` wheel (installed standalone, outside the workspace) now resolves ``renderers`` from PyPI instead of the git tag URL it used while the dev release was waiting on PyPI trusted-publisher config. The full ``prime-rl`` workspace still pulls renderers from the ``deps/renderers`` submodule, so this only affects downstream slim consumers. Re-locks to refresh the lockfile against the same workspace state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The slim ``prime-rl-configs`` install path (configs only, no GPU
deps) asserts that importing the config classes doesn't pull in
``transformers``. ``from renderers import RendererConfig`` at the top
of ``orchestrator.py`` / ``sft.py`` broke that: ``renderers/__init__``
eagerly imports every concrete renderer module (DefaultRenderer,
Qwen3Renderer, …), each of which imports
``transformers.tokenization_utils.PreTrainedTokenizer``.
Defer the renderers import via:
- ``TYPE_CHECKING``-guarded ``from renderers import RendererConfig``
so static type checkers still see the typed discriminated union.
- ``renderer: Any`` field annotation at runtime, so no resolution
happens at class definition.
- ``model_validator(mode="before") _resolve_renderer`` that
lazy-imports ``renderers``, fills the orchestrator default
(``AutoRendererConfig()``), and coerces dict input through the
typed union via ``TypeAdapter``. SFT's default stays ``None``.
End-user semantics unchanged; class import is now transformers-free.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
dev28 lazy-loads concrete renderer classes via PEP 562 ``__getattr__`` (renderers#64), so ``from renderers import RendererConfig`` no longer drags ``transformers`` into ``sys.modules``. The slim-install CI gate is satisfied without the prior workaround. Drop the ``Any`` annotation and the ``_resolve_renderer`` model-validator that lazy-resolved the discriminated union from ``OrchestratorConfig`` and ``SFTConfig``. The fields are typed ``RendererConfig | None`` again, and pydantic does the full discriminator validation at class definition. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- deps/renderers c86c50b -> 2ec28a8 — picks up the PEP 562 ``__getattr__`` so the workspace install path also lazy-loads concrete renderer classes. - deps/verifiers 04651be -> d107442 — adds the typed ``ClientConfig.renderer_config`` field that ``setup_clients`` forwards. Fixes the unit-test ``test_setup_clients_assigns_renderer_and_dp_rank_headers`` AttributeError on ``renderer_config``. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two cleanups on the typed renderer surface: 1. Per Samsja: drop the ``RendererConfig as RendererSettings`` alias in ``client.py`` / ``elastic.py``. The alias was a workaround for the former local ``RendererConfig`` wrapper class; with that gone, we just import ``RendererConfig`` from renderers directly. 2. Per Cursor bugbot: ``OrchestratorConfig.renderer = None`` (MITO) was silently dropped by ``model_dump(exclude_none=True)``, so any saved ``control/orch.toml`` would reload as the default ``AutoRendererConfig()`` (TITO). Add a wrapping ``model_serializer`` that re-injects ``renderer = "None"`` (string) when the field is ``None``, matching what ``BaseConfig._none_str_to_none`` accepts on reload. MITO and TITO now round-trip through TOML. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 2072afe. Configure here.
|
|
||
| use_renderer: bool = True | ||
| """Use the renderer-backed TITO client (client-side tokenization via the ``renderers`` package, served by ``/v1/generate``). When True, the ``[orchestrator.renderer]`` block (name / tool_parser / reasoning_parser / pool_size) applies. Default for both text-only and VLM rollouts; VLMs require it. False falls back to MITO (``openai_chat_completions``).""" | ||
|
|
There was a problem hiding this comment.
Missing breaking config changelog
Medium Severity
This PR removes orchestrator.use_renderer, moves renderer pool sizing to orchestrator.renderer_pool_size, and replaces the flat [orchestrator.renderer] schema with the typed renderers.RendererConfig union (MITO via renderer = "None"). Those are breaking config changes, but CHANGELOG.md was not updated with a migration entry.
Triggered by project rule: BugBot Instructions
Reviewed by Cursor Bugbot for commit 2072afe. Configure here.
| renderer: RendererConfig = RendererConfig() | ||
| """Client-side renderer configuration. Only consumed when ``use_renderer=true``.""" | ||
| renderer: RendererConfig | None = Field(default_factory=AutoRendererConfig) | ||
| """Typed renderer config (``renderers.RendererConfig`` discriminated |
There was a problem hiding this comment.
can we make this docstring a bit more concise
| @model_serializer(mode="wrap") | ||
| def _preserve_mito_renderer(self, handler: SerializerFunctionWrapHandler) -> dict[str, Any]: | ||
| """Emit ``renderer = "None"`` (string) when MITO so | ||
| ``model_dump(exclude_none=True)`` round-trips: dumped TOML has | ||
| ``renderer = "None"``, and on reload | ||
| ``BaseConfig._none_str_to_none`` coerces it back to ``None``. | ||
| Without this, a MITO orchestrator config saved to | ||
| ``control/orch.toml`` would lose the renderer key entirely and | ||
| reload as the default ``AutoRendererConfig()`` (TITO).""" | ||
| result = handler(self) | ||
| if self.renderer is None: | ||
| result["renderer"] = "None" | ||
| return result |
There was a problem hiding this comment.
hmm this is code smell from the pydantic-config side. can merge for now but can you put an issue to fix this one
…cstring - renderer field uses direct AutoRendererConfig() default (frozen, no factory needed) - renderer_pool_size -> pool_size on OrchestratorConfig; cascade through client.py, elastic.py, orchestrator.py call sites and the test; vf.ClientConfig kwarg name (verifiers-side) stays renderer_pool_size - shorter docstring on the renderer field Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
mikasenghaas
left a comment
There was a problem hiding this comment.
as discussed, approving this to unblock quickly. short-term cleanup would be to expose resolve_renderer(model_name: str) -> RendererConfig from renderers so we can resolve 1. the renderer 2. the chat template kwargs in the prime-rl configs
…onfig # Conflicts: # docs/multimodal.md


Summary
Switches the orchestrator and SFT trainer to consume the typed
renderers.RendererConfigdiscriminated union (PrimeIntellect-ai/renderers#60)and the matching
vf.ClientConfig.renderer_configfield(PrimeIntellect-ai/verifiers#1467) — both now merged.
OrchestratorConfig.renderer: RendererConfig | None— a typed discriminated union (defaultAutoRendererConfig),Noneopts into MITO. prime-rl's own duplicateRendererConfiginconfigs.sharedis removed; therendererspackage is the single source of truth.[orchestrator.renderer], dispatched on thenamediscriminator. No.settingsnesting (dropped after review).renderer_pool_size→pool_size, moved to the orchestrator root.setup_inference_pool/setup_clients/Static+ElasticInferencePoolcollapse the multi-arg renderer surface to a singlerenderer_configarg.model_serializerwritesrenderer = "None"so MITO configs round-trip through the savedorch.toml.TOML before / after
Before:
After (typed; or omit the block entirely for
AutoRendererConfig):MITO mode:
Bogus combinations (e.g.
add_vision_idundername = "qwen3") raisepydantic.ValidationErrorat config-load time instead of silently no-op'ing.Dependencies (all landed)
ClientConfig.renderer_config) — mergeddeps/verifiersbumped tof9c68eb2(verifiers 0.1.15.dev11), which also includes PoC: add Qwen3 VL support #1463 (drop silent tool_call status filter) and Bump verifiers #1468 (pipechat_template_kwargsinto the typed renderer config)chat_template_kwargsviasampling.extra_body)Merged latest
main(docs revamp #2602 + wandb-metrics-by-role #2642); resolved thedocs/multimodal.mdmodify/delete conflict and fixed a staleuse_renderer = truereference indocs/training.md.Validation
configs/*.tomlload cleanly against the newRLConfigschema; grep confirms zero stale shapes (.settings,use_renderer,renderer_pool_size) remain in-tree.renderer = "None"→cfg.renderer is None→model_dump→renderer = "None"→ reload →None.[renderer]block →AutoRendererConfig→Qwen36Renderer): 32 rollouts / 686 assistant turns, reward 0.64, renderer route exclusively (0 chat-completions).enable_thinking=falsepropagated correctly — 0 turns with reasoning content.pytest tests/unit/orchestrator/test_orchestrator_setup.py tests/unit/test_configs.py— pass (the one failing case is an untracked local example config using the old shape, not in this PR).Note
Medium Risk
Changes how rollout tokenization is selected (TITO vs MITO) and config validation for VLMs and auto-renderer resolution; misconfigured TOML could fail at load time or pick the wrong client until upstream renderers/verifiers deps align.
Overview
Replaces the
orchestrator.use_rendererboolean withorchestrator.renderer: a typedrenderers.RendererConfigdiscriminated union (defaultAutoRendererConfig) when using the renderer/TITO path, orNonefor MITO. The duplicate flatRendererConfiginconfigs.sharedis removed in favor of therendererspackage;create_rendererand verifiers clients now take a singlerenderer_configinstead of scattered **name/ parsers / preserve_*` fields.orchestrator.pool_sizemoves to the orchestrator root (no longer under[orchestrator.renderer]). Validators and docs are updated accordingly (VLM requires a non-Nonerenderer, SFT forcesrenderer=None,pool_sizeonly when a renderer is set). A custom serializer writesrenderer = "None"so MITO configs round-trip through savedorch.toml. Sample configs, multimodal docs,prime-rl-configsrendererspin, and unit tests follow the new shape.Reviewed by Cursor Bugbot for commit 310a2fa. Bugbot is set up for automated code reviews on this repo. Configure here.