Qwen35Renderer: missing dispatch in renderers.client._build_mm_features for vLLM multimodal payload #39

@eligotts

Summary

renderers.client._build_mm_features (renderers/client.py:202) raises NotImplementedError for Qwen35Renderer, even though Qwen35Renderer is a MultimodalRenderer and ships the same pixel_values + image_grid_thw payload shape as Qwen3VLRenderer.

Reproduced while dogfooding the prime-rl multimodal renderer path (prime-rl PR #2473) with Qwen/Qwen3.5-2B on the color-codeword env: every rollout fails immediately with:

NotImplementedError: Multimodal serialization not implemented for Qwen35Renderer.
Add a dispatch branch in renderers.client._build_mm_features.

Repro

from transformers import AutoTokenizer
from renderers import create_renderer, is_multimodal
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3.5-2B", trust_remote_code=True)
r = create_renderer(tok, renderer="auto")
print(type(r).__name__)        # Qwen35Renderer
print(is_multimodal(r))        # True
# ... any /inference/v1/generate call that goes through renderers.client.generate
# with multi_modal_data populated raises NotImplementedError.

Suggested fix

Current code (renderers/client.py:236):

from renderers.qwen3_vl import Qwen3VLRenderer
if issubclass(renderer_cls, Qwen3VLRenderer):
    return _build_qwen_vl_features(mm_data, spatial_merge_size=2)
raise NotImplementedError(...)

renderers/qwen35.py:325–328 packs pixel_values and image_grid_thw into mm_items["image"], identical to Qwen3VLRenderer. It also reads merge_size from proc.image_processor.merge_size (renderers/qwen35.py:187), which suggests the Qwen2-VL family field factory should apply. The fix looks like:

from renderers.qwen3_vl import Qwen3VLRenderer
from renderers.qwen35 import Qwen35Renderer
if issubclass(renderer_cls, (Qwen3VLRenderer, Qwen35Renderer)):
    return _build_qwen_vl_features(mm_data, spatial_merge_size=2)
raise NotImplementedError(...)
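
A regression check for this dispatch could look roughly like the sketch below. It is self-contained and does not import the real package: the renderer classes, OtherRenderer, build_mm_features, and the body of _build_qwen_vl_features are stand-ins written for illustration only, assuming nothing about the real helper beyond the call shape quoted above.

```python
# Stand-in classes; the real ones live in the renderers package.
class MultimodalRenderer: ...
class Qwen3VLRenderer(MultimodalRenderer): ...
class Qwen35Renderer(MultimodalRenderer): ...
class OtherRenderer(MultimodalRenderer): ...  # hypothetical, has no dispatch branch


def _build_qwen_vl_features(mm_data, spatial_merge_size):
    # Placeholder body: just records which branch ran and with which merge size.
    return {"family": "qwen_vl", "spatial_merge_size": spatial_merge_size, "items": mm_data}


def build_mm_features(renderer_cls, mm_data):
    # issubclass accepts a tuple, so both Qwen renderers share one branch.
    if issubclass(renderer_cls, (Qwen3VLRenderer, Qwen35Renderer)):
        return _build_qwen_vl_features(mm_data, spatial_merge_size=2)
    raise NotImplementedError(
        f"Multimodal serialization not implemented for {renderer_cls.__name__}."
    )


# Both Qwen renderers now resolve to the same builder.
for cls in (Qwen3VLRenderer, Qwen35Renderer):
    assert build_mm_features(cls, {"image": []})["family"] == "qwen_vl"
```

Renderers without a branch still raise NotImplementedError, so the existing error message for unsupported models is unchanged.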

One caveat to verify: spatial_merge_size is hardcoded to 2 in the call to _build_qwen_vl_features. Qwen3.5's image processor exposes merge_size per model; if any Qwen3.5 size ships merge_size != 2, the value should be read from mm_items metadata rather than hardcoded.
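
One way to avoid the hardcoded value, sketched under assumptions: resolve_spatial_merge_size and placeholder_token_count are hypothetical helpers, not part of renderers, and the processor objects are SimpleNamespace stand-ins for a Hugging Face processor. The token-count function assumes the Qwen2-VL convention that an image with grid (t, h, w) expands to t*h*w / merge_size**2 placeholder tokens, which is why a wrong merge size would desynchronize the payload from the prompt.

```python
from types import SimpleNamespace


def resolve_spatial_merge_size(processor, default=2):
    """Read merge_size from processor.image_processor, falling back to default."""
    image_processor = getattr(processor, "image_processor", None)
    merge_size = getattr(image_processor, "merge_size", None)
    return default if merge_size is None else int(merge_size)


def placeholder_token_count(grid_thw, merge_size):
    """Placeholder tokens for one image under the assumed Qwen2-VL convention."""
    t, h, w = grid_thw
    return (t * h * w) // (merge_size ** 2)


# Stand-in processors: one exposing merge_size, one without the attribute.
proc_with = SimpleNamespace(image_processor=SimpleNamespace(merge_size=2))
proc_without = SimpleNamespace(image_processor=SimpleNamespace())

print(resolve_spatial_merge_size(proc_with))      # 2
print(resolve_spatial_merge_size(proc_without))   # 2 (default)
print(placeholder_token_count((1, 16, 16), 2))    # 64
```

Whether the resolved value is carried in mm_items metadata or looked up from the processor at dispatch time is a design choice for the maintainers; the sketch only shows the lookup with a safe fallback.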

Versions

  • renderers==0.1.8.dev2
  • verifiers==0.1.15.dev7 (verifiers main, post #1395)
  • prime-rl PR #2473 branch feat/multimodal-renderer-pr
  • vLLM 0.20.2

Workaround used during dogfood

Pivoted the test run to Qwen/Qwen3-VL-4B-Instruct (PR #2473's tested model), which routes through the Qwen3VLRenderer dispatch and works.
