feat(map): add Qwen3-30B-A3B-{Instruct,Thinking}-2507, GLM-5-FP8 #50
Merged
hallerite
added a commit
to PrimeIntellect-ai/prime-rl
that referenced
this pull request
May 18, 2026
Closes #2537.

When `use_renderer=True` with `renderer.name='auto'` and `model.name` isn't in `MODEL_RENDERER_MAP`, `create_renderer` silently falls back to `DefaultRenderer`. That fallback (a) doesn't fix the position-dependent chat-template bug the renderer client exists to solve, and (b) rejects envs that pass tools (rollout dies with "RendererPool does not support tools") unless `renderer.tool_parser` is set. Today this only surfaces mid-rollout.

Add a config-time `@model_validator(mode="after")` on `OrchestratorConfig` that rejects this combination at parse time, so `--dry-run` reports it. Lazy-imports `MODEL_RENDERER_MAP` so the slim `prime-rl-configs` package still parses configs when `renderers` isn't installed.

Sweep 25 existing configs to opt into a renderer explicitly:
- 20 PI-vendored / fine-tuned / R1-distilled configs get `[orchestrator.renderer] name = "default"` (the right choice: their templates are customized, so `apply_chat_template` is correct; forcing a model-specific renderer would emit canonical tokens that don't match the vendored template). R1 distills also get `reasoning_parser = "think"`.
- 5 configs whose model is template-identical (md5-confirmed) to an already-mapped sibling get the model-specific renderer name explicitly: GLM-5-FP8 → `glm-5`, Qwen3-30B-A3B-Thinking-2507 → `qwen3`, PrimeIntellect/MiniMax-M2.5-bf16 → `minimax-m2`.

Those three models will also be added to `MODEL_RENDERER_MAP` upstream (PrimeIntellect-ai/renderers#50). Once that lands and the submodule bumps, the explicit names in those 5 configs become redundant and can be removed in a follow-up, but the PR here is self-contained and doesn't depend on the renderers PR landing first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
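A minimal sketch of the config-time check this commit message describes, for readers who want to see its shape. It is not the prime-rl implementation: it uses a stdlib dataclass `__post_init__` as a stand-in for the pydantic `@model_validator(mode="after")`, the field layout (`model_name`, `RendererConfig`) is hypothetical, and the map contents are a small illustrative subset with string values assumed.

```python
from dataclasses import dataclass, field

# Hypothetical, trimmed stand-in for the real MODEL_RENDERER_MAP.
MODEL_RENDERER_MAP = {
    "Qwen/Qwen3-4B-Instruct-2507": "qwen3",
    "zai-org/GLM-5": "glm-5",
}


@dataclass
class RendererConfig:
    # "auto" means: resolve the renderer from model_name via the map.
    name: str = "auto"


@dataclass
class OrchestratorConfig:
    model_name: str
    use_renderer: bool = True
    renderer: RendererConfig = field(default_factory=RendererConfig)

    def __post_init__(self) -> None:
        # Stand-in for a pydantic @model_validator(mode="after"):
        # reject the bad combination at parse time, not mid-rollout.
        if not self.use_renderer or self.renderer.name != "auto":
            return
        if self.model_name not in MODEL_RENDERER_MAP:
            raise ValueError(
                f"renderer.name='auto' but {self.model_name!r} is not in "
                "MODEL_RENDERER_MAP; set [orchestrator.renderer] name explicitly"
            )
```

With this shape, a mapped model or an explicit renderer name constructs normally, while an unmapped model left on `auto` raises as soon as the config object is built.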
mikasenghaas
previously approved these changes
May 18, 2026
68f05b8 to 796ba57
796ba57 to d9156cd
Approvability
Verdict: Approved
Purely additive changes to MODEL_RENDERER_MAP, mapping three new model identifiers to existing renderers. No new logic or behavioral changes - just extends existing configuration patterns.
All three have md5-identical chat_template.jinja to a model already in MODEL_RENDERER_MAP, so they can route to the same renderer:
- Qwen/Qwen3-30B-A3B-Instruct-2507 -> qwen3 (== Qwen3-4B-Instruct-2507)
- Qwen/Qwen3-30B-A3B-Thinking-2507 -> qwen3 (== Qwen3-4B-Thinking-2507)
- zai-org/GLM-5-FP8 -> glm-5 (FP8 quant of GLM-5, same template)

Policy: do not register PrimeIntellect/* mirrors in MODEL_RENDERER_MAP. PI repos mostly exist to carry chat-template patches, which renderers ignore anyway (they apply their own hardcoded templates). Prime-rl configs should point at the upstream model id to get auto-resolution; the rare exceptions (e.g. PrimeIntellect/MiniMax-M2.5-bf16 needed for bf16 dtype training) can set renderer.name explicitly.
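The md5 comparison mentioned in this commit message could look roughly like the sketch below. The helper names are hypothetical, fetching the files from the HF repo is left out, and it assumes `chat_template` in `tokenizer_config.json` is a single string.

```python
import hashlib
import json


def template_md5(template: str) -> str:
    """Return the md5 hex digest of a chat template's UTF-8 bytes."""
    return hashlib.md5(template.encode("utf-8")).hexdigest()


def template_from_tokenizer_config(raw_json: str) -> str:
    """Extract the chat template from the text of a tokenizer_config.json.

    Assumes the "chat_template" value is a plain string.
    """
    return json.loads(raw_json)["chat_template"]


def same_template(candidate: str, sibling: str) -> bool:
    """True when two templates hash identically, i.e. are byte-identical."""
    return template_md5(candidate) == template_md5(sibling)
```

Hashing the raw bytes means even a trailing-whitespace difference counts as a mismatch, which is the conservative choice when deciding whether two models can share a hand-coded renderer.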
d9156cd to b3d3f2e
Summary
Adds three entries to `MODEL_RENDERER_MAP`. Each has a `chat_template.jinja` that is byte-identical (md5-confirmed) to a model already in the map, so they can route to the same hand-coded renderer instead of falling back to `DefaultRenderer`:
- `Qwen/Qwen3-30B-A3B-Instruct-2507` -> `qwen3` (== `Qwen/Qwen3-4B-Instruct-2507`)
- `Qwen/Qwen3-30B-A3B-Thinking-2507` -> `qwen3` (== `Qwen/Qwen3-4B-Thinking-2507`)
- `zai-org/GLM-5-FP8` -> `glm-5` (== `zai-org/GLM-5`)
Why
Unblocks four prime-rl example configs (`multinode`, `qwen30b_math`, `qwen30b_swe`, `glm5_pd_disag`) that were silently falling back to `DefaultRenderer` because their `model.name` wasn't an exact match in the map. With the new config-time validator in prime-rl (PrimeIntellect-ai/prime-rl#2540), these would otherwise need explicit `[orchestrator.renderer]` overrides, and adding to the map is the right answer because the templates are confirmed identical.
Policy: no `PrimeIntellect/*` mirrors in the map
An earlier revision also added `PrimeIntellect/MiniMax-M2.5-bf16 → minimax-m2`. That entry has been removed.
`PrimeIntellect/*` re-uploads of upstream models mostly exist to carry chat-template patches. The renderers in this map ignore those patches: every hand-coded renderer applies its own hardcoded template, not the model's. So adding PI mirrors to the map gives them no benefit, and inviting them in makes the map a moving target as PI ships and retires patched mirrors.
Prime-rl configs should reference the upstream model id directly (e.g. `Qwen/Qwen3-0.6B`, not `PrimeIntellect/Qwen3-0.6B`) so auto-resolution picks the right hand-coded renderer. The rare exception (`PrimeIntellect/MiniMax-M2.5-bf16`, needed because we have to train in bf16 and upstream ships fp8) keeps the PI id but sets `renderer.name = "minimax-m2"` explicitly in the prime-rl config.
Verification
For each candidate, fetched `chat_template.jinja` (or extracted `chat_template` from `tokenizer_config.json`) from the HF repo and md5-hashed against the already-mapped sibling.
`test_modality_registry_models_route_to_renderer` passes locally.
Note
Add Qwen3-30B-A3B-Instruct-2507, Qwen3-30B-A3B-Thinking-2507, and GLM-5-FP8 to renderer map
Adds three new model ID entries to `MODEL_RENDERER_MAP` in renderers/base.py, routing `Qwen/Qwen3-30B-A3B-Instruct-2507` and `Qwen/Qwen3-30B-A3B-Thinking-2507` to the `qwen3` renderer and `zai-org/GLM-5-FP8` to the `glm-5` renderer instead of falling back to the default renderer.
Macroscope summarized b3d3f2e.
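To illustrate the routing this PR changes, here is a rough sketch of exact-match resolution with a default fallback. It is an assumption-laden illustration, not the code in renderers/base.py: `resolve_renderer_name` is a hypothetical helper, the map is a trimmed subset, and the real map's value type may differ from the plain strings used here.

```python
# Illustrative subset of the map; values are assumed to be renderer names.
MODEL_RENDERER_MAP = {
    # Already-mapped siblings named in this PR.
    "Qwen/Qwen3-4B-Instruct-2507": "qwen3",
    "Qwen/Qwen3-4B-Thinking-2507": "qwen3",
    "zai-org/GLM-5": "glm-5",
    # Entries added by this PR.
    "Qwen/Qwen3-30B-A3B-Instruct-2507": "qwen3",
    "Qwen/Qwen3-30B-A3B-Thinking-2507": "qwen3",
    "zai-org/GLM-5-FP8": "glm-5",
}


def resolve_renderer_name(model_name: str, renderer_name: str = "auto") -> str:
    """Pick a renderer name for a model (hypothetical helper).

    An explicit renderer_name always wins. With "auto", the model id must
    match a map key exactly; anything else falls back to "default".
    """
    if renderer_name != "auto":
        return renderer_name
    return MODEL_RENDERER_MAP.get(model_name, "default")
```

Under these assumptions, `zai-org/GLM-5-FP8` now resolves to `glm-5` on `auto`, while a `PrimeIntellect/*` mirror still falls through to the default unless its config names a renderer explicitly, which matches the policy stated above.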