feat(map): add Qwen3-30B-A3B-{Instruct,Thinking}-2507, GLM-5-FP8#50

Merged
hallerite merged 1 commit into main from feat/map-pi-mirrors-and-thinking-variants on May 18, 2026

Conversation


@hallerite hallerite commented May 18, 2026

Summary

Adds three entries to MODEL_RENDERER_MAP. Each has a chat template (chat_template.jinja, or the chat_template field of tokenizer_config.json) that is byte-identical (md5-confirmed) to that of a model already in the map, so they can route to the same hand-coded renderer instead of falling back to DefaultRenderer:

| New entry | Routes to | Identical to |
|---|---|---|
| Qwen/Qwen3-30B-A3B-Instruct-2507 | qwen3 | Qwen/Qwen3-4B-Instruct-2507 |
| Qwen/Qwen3-30B-A3B-Thinking-2507 | qwen3 | Qwen/Qwen3-4B-Thinking-2507 |
| zai-org/GLM-5-FP8 | glm-5 | zai-org/GLM-5 |
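
For readers unfamiliar with the map, the routing described here can be sketched roughly as below. This is an illustration only: the real MODEL_RENDERER_MAP lives in renderers/base.py and may store renderer classes rather than strings, and `resolve_renderer_name` is a hypothetical helper, not the library's `create_renderer`. The sketch assumes an exact-match lookup on the model id and that an explicitly configured renderer name takes precedence over auto-resolution.

```python
# Hypothetical stand-in for MODEL_RENDERER_MAP; only ids named in this PR are listed.
MODEL_RENDERER_MAP = {
    # Already mapped siblings:
    "Qwen/Qwen3-4B-Instruct-2507": "qwen3",
    "Qwen/Qwen3-4B-Thinking-2507": "qwen3",
    "zai-org/GLM-5": "glm-5",
    # Entries added by this PR:
    "Qwen/Qwen3-30B-A3B-Instruct-2507": "qwen3",
    "Qwen/Qwen3-30B-A3B-Thinking-2507": "qwen3",
    "zai-org/GLM-5-FP8": "glm-5",
}


def resolve_renderer_name(model_name: str, renderer_name: str = "auto") -> str:
    """Pick a renderer name (hypothetical helper, not the library API)."""
    if renderer_name != "auto":
        # An explicit name in the config wins over auto-resolution.
        return renderer_name
    # Exact-match lookup: any id not in the map falls back to the default renderer.
    return MODEL_RENDERER_MAP.get(model_name, "default")


print(resolve_renderer_name("zai-org/GLM-5-FP8"))                               # glm-5
print(resolve_renderer_name("PrimeIntellect/MiniMax-M2.5-bf16"))                # default
print(resolve_renderer_name("PrimeIntellect/MiniMax-M2.5-bf16", "minimax-m2"))  # minimax-m2
```

Because the lookup is an exact string match, a mirror or quantized variant with an identical template still falls through to the default unless its id is listed or the config names a renderer explicitly.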

Why

Unblocks four prime-rl example configs (multinode, qwen30b_math, qwen30b_swe, glm5_pd_disag) that were silently falling back to DefaultRenderer because their model.name wasn't an exact match in the map. With the new config-time validator in prime-rl (PrimeIntellect-ai/prime-rl#2540), these would otherwise need explicit [orchestrator.renderer] overrides — and adding to the map is the right answer because the templates are confirmed identical.

Policy: no PrimeIntellect/* mirrors in the map

An earlier revision also added PrimeIntellect/MiniMax-M2.5-bf16 → minimax-m2. That entry has been removed.

PrimeIntellect/* re-uploads of upstream models mostly exist to carry chat-template patches. The renderers in this map ignore those patches — every hand-coded renderer applies its own hardcoded template, not the model's. So adding PI mirrors to the map gives them no benefit, and inviting them in makes the map a moving target as PI ships and retires patched mirrors.

Prime-rl configs should reference the upstream model id directly (e.g. Qwen/Qwen3-0.6B, not PrimeIntellect/Qwen3-0.6B) so auto-resolution picks the right hand-coded renderer. The rare exception is PrimeIntellect/MiniMax-M2.5-bf16, which is needed because we have to train in bf16 and upstream ships fp8: its prime-rl config keeps the PI id but sets renderer.name = "minimax-m2" explicitly.

Verification

For each candidate, fetched chat_template.jinja (or extracted chat_template from tokenizer_config.json) from the HF repo and md5-hashed against the already-mapped sibling:

5795f12e815e  Qwen/Qwen3-4B-Instruct-2507/tokenizer_config.json chat_template (2630 B)
5795f12e815e  Qwen/Qwen3-30B-A3B-Instruct-2507/tokenizer_config.json chat_template (2630 B)

(4168 B)  Qwen/Qwen3-4B-Thinking-2507/tokenizer_config.json chat_template
(4168 B)  Qwen/Qwen3-30B-A3B-Thinking-2507/tokenizer_config.json chat_template

c39fa970531b98cb889b16714f8c5512  zai-org/GLM-5/chat_template.jinja
c39fa970531b98cb889b16714f8c5512  zai-org/GLM-5-FP8/chat_template.jinja
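
A comparison like the one above could be reproduced with a few lines of Python. This is a sketch, not the script actually used for this PR: the function names are hypothetical, and it assumes the files have already been downloaded from the HF repos and are passed in as raw bytes. It hashes either the whole chat_template.jinja file or the chat_template string extracted from tokenizer_config.json.

```python
import hashlib
import json


def md5_of_jinja_file(raw: bytes) -> str:
    """MD5 of a chat_template.jinja file's raw bytes."""
    return hashlib.md5(raw).hexdigest()


def md5_of_tokenizer_config(raw: bytes) -> str:
    """MD5 of the chat_template string stored inside tokenizer_config.json."""
    config = json.loads(raw)
    # Hash only the template text, so unrelated tokenizer fields cannot affect the digest.
    return hashlib.md5(config["chat_template"].encode("utf-8")).hexdigest()


def templates_match(digest_a: str, digest_b: str) -> bool:
    """Two templates are treated as identical when their digests are equal."""
    return digest_a == digest_b
```

Hashing the extracted field rather than the whole tokenizer_config.json matters because sibling models can differ in other tokenizer settings while sharing the same template.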

test_modality_registry_models_route_to_renderer passes locally.

Note

Add Qwen3-30B-A3B-Instruct-2507, Qwen3-30B-A3B-Thinking-2507, and GLM-5-FP8 to renderer map

Adds three new model ID entries to MODEL_RENDERER_MAP in renderers/base.py, routing Qwen/Qwen3-30B-A3B-Instruct-2507 and Qwen/Qwen3-30B-A3B-Thinking-2507 to the qwen3 renderer and zai-org/GLM-5-FP8 to the glm-5 renderer instead of falling back to the default renderer.

Macroscope summarized b3d3f2e.

hallerite added a commit to PrimeIntellect-ai/prime-rl that referenced this pull request May 18, 2026
Closes #2537.

When `use_renderer=True` with `renderer.name='auto'` and `model.name`
isn't in `MODEL_RENDERER_MAP`, `create_renderer` silently falls back
to `DefaultRenderer`. That fallback (a) doesn't fix the
position-dependent chat-template bug the renderer client exists to
solve, and (b) rejects envs that pass tools (rollout dies with
"RendererPool does not support tools") unless `renderer.tool_parser`
is set. Today this only surfaces mid-rollout.

Add a config-time `@model_validator(mode="after")` on
`OrchestratorConfig` that rejects this combination at parse time, so
`--dry-run` reports it. Lazy-imports `MODEL_RENDERER_MAP` so the slim
`prime-rl-configs` package still parses configs when `renderers`
isn't installed.

Sweep 25 existing configs to opt into a renderer explicitly:
- 20 PI-vendored / fine-tuned / R1-distilled configs get
  `[orchestrator.renderer] name = "default"` (the right choice — their
  templates are customized, so `apply_chat_template` is correct;
  forcing a model-specific renderer would emit canonical tokens that
  don't match the vendored template). R1 distills also get
  `reasoning_parser = "think"`.
- 5 configs whose model is template-identical (md5-confirmed) to an
  already-mapped sibling get the model-specific renderer name
  explicitly: GLM-5-FP8 → `glm-5`, Qwen3-30B-A3B-Thinking-2507 →
  `qwen3`, PrimeIntellect/MiniMax-M2.5-bf16 → `minimax-m2`.

Those three models will also be added to MODEL_RENDERER_MAP upstream
(PrimeIntellect-ai/renderers#50). Once that lands and the submodule
bumps, the explicit names in those 5 configs become redundant and
can be removed in a follow-up — but the PR here is self-contained
and doesn't depend on the renderers PR landing first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
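
The config-time check described in this commit message can be illustrated with a small sketch. It is not the prime-rl implementation: it uses a stdlib dataclass with `__post_init__` as a stand-in for Pydantic's `@model_validator(mode="after")`, and `OrchestratorConfigSketch`, `load_renderer_map`, and the field names are hypothetical. The import path `renderers.base` is inferred from the PR note that places MODEL_RENDERER_MAP in renderers/base.py.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


def load_renderer_map() -> Optional[Mapping[str, str]]:
    """Import the map lazily; return None when `renderers` is not installed."""
    try:
        from renderers.base import MODEL_RENDERER_MAP  # assumed import path
    except ImportError:
        return None
    return MODEL_RENDERER_MAP


@dataclass
class OrchestratorConfigSketch:
    """Hypothetical, flattened stand-in for the real OrchestratorConfig."""

    model_name: str
    use_renderer: bool = True
    renderer_name: str = "auto"
    # Injectable so the check can be exercised without the `renderers` package.
    map_loader: Callable[[], Optional[Mapping[str, str]]] = load_renderer_map

    def __post_init__(self) -> None:
        # Only the auto-resolution path can silently fall back to the default renderer.
        if not (self.use_renderer and self.renderer_name == "auto"):
            return
        known = self.map_loader()
        if known is None:
            # Slim install without `renderers`: skip the check so configs still parse.
            return
        if self.model_name not in known:
            raise ValueError(
                f"model {self.model_name!r} has no entry in MODEL_RENDERER_MAP; "
                "set the renderer name explicitly instead of 'auto'"
            )
```

Running the check when the config object is built means an unmapped model is reported at parse time rather than surfacing mid-rollout.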
@hallerite hallerite marked this pull request as ready for review May 18, 2026 13:45
mikasenghaas previously approved these changes May 18, 2026
@hallerite hallerite force-pushed the feat/map-pi-mirrors-and-thinking-variants branch from 68f05b8 to 796ba57 Compare May 18, 2026 17:14
@hallerite hallerite changed the title from "feat(map): add Qwen3-30B-Thinking-2507, GLM-5-FP8, MiniMax-M2.5-bf16" to "feat(map): add Qwen3-30B-Thinking-2507, GLM-5-FP8" May 18, 2026
@hallerite hallerite requested a review from mikasenghaas May 18, 2026 17:30
@hallerite hallerite force-pushed the feat/map-pi-mirrors-and-thinking-variants branch from 796ba57 to d9156cd Compare May 18, 2026 17:33
macroscopeapp Bot previously approved these changes May 18, 2026

macroscopeapp Bot commented May 18, 2026

Approvability

Verdict: Approved

Purely additive changes to MODEL_RENDERER_MAP, mapping three new model identifiers to existing renderers. No new logic or behavioral changes - just extends existing configuration patterns.


Comment thread renderers/base.py
All three have md5-identical chat_template.jinja to a model already in
MODEL_RENDERER_MAP, so they can route to the same renderer:

- Qwen/Qwen3-30B-A3B-Instruct-2507 -> qwen3 (== Qwen3-4B-Instruct-2507)
- Qwen/Qwen3-30B-A3B-Thinking-2507 -> qwen3 (== Qwen3-4B-Thinking-2507)
- zai-org/GLM-5-FP8 -> glm-5 (FP8 quant of GLM-5, same template)

Policy: do not register PrimeIntellect/* mirrors in MODEL_RENDERER_MAP.
PI repos mostly exist to carry chat-template patches, which renderers
ignore anyway (they apply their own hardcoded templates). Prime-rl
configs should point at the upstream model id to get auto-resolution;
the rare exceptions (e.g. PrimeIntellect/MiniMax-M2.5-bf16 needed for
bf16 dtype training) can set renderer.name explicitly.
@hallerite hallerite force-pushed the feat/map-pi-mirrors-and-thinking-variants branch from d9156cd to b3d3f2e Compare May 18, 2026 17:40
@hallerite hallerite changed the title from "feat(map): add Qwen3-30B-Thinking-2507, GLM-5-FP8" to "feat(map): add Qwen3-30B-A3B-{Instruct,Thinking}-2507, GLM-5-FP8" May 18, 2026
@hallerite hallerite merged commit 8704f9d into main May 18, 2026
11 checks passed
@hallerite hallerite deleted the feat/map-pi-mirrors-and-thinking-variants branch May 18, 2026 17:47

2 participants