feat(vision): increase max vision resolution to support qwen3-vl high resolution image processing capability. #1835

alifurkanstahl · 2025-11-05T22:51:17Z

Updates the vision max resolution from 2048 to 32768 to match Qwen3-VL's tokenizer capabilities. Qwen3-VL's tokenizer can handle much higher-resolution images, so we need to raise the artificial limit to allow full model functionality.

LostRuins · 2025-11-06T06:34:05Z

I don't think this will work. Even though the model might have been trained on higher-resolution images, the backend cannot handle such large images. Moreover, I have found very few tasks that cannot be done with a 2048x2048 image, and allowing this risks crashing the server. Have you tested with this size, and do you have a use case for it?

alifurkanstahl · 2025-11-07T00:47:29Z

The 32K limit isn’t meant to support full 32K×32K images. It’s mainly to avoid unnecessary resizing of extremely wide or narrow images, such as 9k×1k.
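To make the aspect-ratio concern concrete, here is a minimal sketch of a per-side clamp. It assumes the limit is applied to the longer side with the aspect ratio preserved; `clamp_to_max_side` is a hypothetical helper written for illustration, not KoboldCpp's actual resize code.

```python
def clamp_to_max_side(width: int, height: int, max_res: int) -> tuple[int, int]:
    """Scale (width, height) down so the longer side is at most max_res.

    Assumption: the clamp preserves aspect ratio and never upscales.
    """
    longest = max(width, height)
    if longest <= max_res:
        return width, height  # already within the limit, no resize
    new_w = max(1, round(width * max_res / longest))
    new_h = max(1, round(height * max_res / longest))
    return new_w, new_h


if __name__ == "__main__":
    # A 9k x 1k strip under three candidate limits.
    for limit in (2048, 4096, 32768):
        print(limit, clamp_to_max_side(9000, 1000, limit))
    # 2048 (2048, 228)
    # 4096 (4096, 455)
    # 32768 (9000, 1000)
```

Under this assumption, a 2048 limit leaves only about 228 pixels on the short side of a 9k×1k image, which is where fine detail would be lost.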

KoboldCpp also seems to miscalculate image embedding sizes when using encode_image_with_clip. For example, a 120×159 image gives around 24 tokens in LM Studio (llama.cpp backend) but about 824 in KoboldCpp, so the image encoding logic might need review.

This difference is more noticeable for OCR or text-dense images that require fine detail. The 32K value is just an example, and the actual maximum could be set lower, such as 4K or 8K, without losing flexibility.
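For a rough sense of scale on the token counts above, the sketch below estimates image tokens for a patch-grid encoder. It assumes each token covers a square block of pixels and that partial blocks are padded up; `estimate_image_tokens` and the 28-pixel block size are illustrative assumptions, not values confirmed for Qwen3-VL, llama.cpp, or KoboldCpp.

```python
def estimate_image_tokens(width: int, height: int, pixels_per_token_side: int) -> int:
    """Estimate tokens as the number of square blocks needed to cover the image.

    Assumption: one token per block, with partial blocks rounded up.
    """
    p = pixels_per_token_side
    cols = -(-width // p)   # ceiling division
    rows = -(-height // p)
    return cols * rows


if __name__ == "__main__":
    # Hypothetical 28-pixel blocks; the real value depends on the model.
    print(estimate_image_tokens(120, 159, 28))    # 30
    print(estimate_image_tokens(2048, 2048, 28))  # 5476
```

With these assumed numbers a 120×159 image lands in the tens of tokens, so a count in the hundreds would suggest the image is being resized or padded before encoding. That is only an inference from this estimate and would need checking against the actual encoding path.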

feat(vision): increase maximum vision resolution from 2048 to 4096

7f97410

Update visionmaxres parameter limits and clamp logic to support higher resolution vision processing for models like qwen3-vl. This affects the MMProj vision processing pipeline and GUI configuration options.

Update to 4096

b2d1968

alifurkanstahl marked this pull request as draft November 7, 2025 01:12

LostRuins added the "KIV for now" label (Some issues prevent this from being merged) Nov 17, 2025
