Fix MiniCPM-V model converter and clip to avoid hardcoded values. #14750


Open · wants to merge 5 commits into master

Conversation

@gryffindor-rr (Author)

Tested with llama-mtmd-cli against multiple MiniCPM-V models.
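For illustration only, here is a minimal sketch of the general idea behind the fix: read model parameters from the checkpoint's `config.json` instead of hardcoding them per MiniCPM-V version. The function name, field names, and fallback defaults below are assumptions for the sketch, not the PR's actual code.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical helper (not the PR's actual code): derive vision/resampler
# parameters from the model's config.json rather than branching on a
# hardcoded per-version table in the converter.
def load_vision_params(config_path):
    cfg = json.loads(Path(config_path).read_text())
    vision = cfg.get("vision_config", {})
    return {
        # Fall back to assumed defaults only when a field is absent.
        "hidden_size": vision.get("hidden_size", 1152),
        "image_size": vision.get("image_size", 448),
        "patch_size": vision.get("patch_size", 14),
        "query_num": cfg.get("query_num", 64),
    }

if __name__ == "__main__":
    # Demonstrate with a throwaway config.json written to a temp directory.
    with tempfile.TemporaryDirectory() as d:
        p = Path(d) / "config.json"
        p.write_text(json.dumps({
            "query_num": 64,
            "vision_config": {"hidden_size": 1152, "image_size": 448,
                              "patch_size": 14},
        }))
        print(load_vision_params(p))
```

The benefit is that a new model release (such as MiniCPM-V 4.0) only needs a valid `config.json`, rather than another hardcoded branch in the converter.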

@github-actions bot added the examples and python (python script changes) labels on Jul 18, 2025
@ngxson ngxson self-requested a review July 18, 2025 09:22
@ngxson (Collaborator) left a comment:

Please create PR from a non-master branch next time, otherwise I cannot push my corrections directly to this PR

@gryffindor-rr (Author) replied:

> Please create PR from a non-master branch next time, otherwise I cannot push my corrections directly to this PR

Sure. Which branch would you suggest? Shall I create a new one, say 'minicpmv'?

@gryffindor-rr gryffindor-rr requested a review from ngxson July 28, 2025 05:51
@github-actions bot added the documentation (improvements or additions to documentation) label on Jul 29, 2025
@CISC (Collaborator) commented Jul 31, 2025:

This needs to be updated for MiniCPM-V 4.0 that was just merged.

@gryffindor-rr (Author) replied:

> This needs to be updated for MiniCPM-V 4.0 that was just merged.

Done. Converted the models by following the instructions in https://github.com/ggml-org/llama.cpp/blob/master/docs/multimodal/minicpmv4.0.md

Example output:

```
./build/bin/llama-mtmd-cli -m ~/models/MiniCPM-V-4/model/Model-3.6B-F16.gguf --mmproj ~/models/MiniCPM-V-4/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image ~/codeup/edge_infer_sdk/tests/data/cat.png -p 'Describe image' -ngl 99
main: loading model: /cache/zhanglei/models/MiniCPM-V-4/model/Model-3.6B-F16.gguf
encoding image slice...
image slice encoded in 99 ms
decoding image batch 1/1, n_tokens_batch = 64
image decoded (batch 1/1) in 8 ms

The image features a close-up of a cat. The feline has distinct tabby markings on its fur, characterized by the distinctive stripes and swirls typical of the tabby pattern found in many domestic cats.
The background is blurred but appears to be an indoor setting with neutral colors that do not detract from the subject.
There are no visible texts or additional objects within this frame. The overall composition focuses solely on capturing the detailed features and calm demeanor of the cat as it gazes slightly upwards.

llama_perf_context_print: load time = 1823.19 ms
llama_perf_context_print: prompt eval time = 429.45 ms / 77 tokens ( 5.58 ms per token, 179.30 tokens per second)
llama_perf_context_print: eval time = 960.62 ms / 110 runs ( 8.73 ms per token, 114.51 tokens per second)
llama_perf_context_print: total time = 1852.51 ms / 187 tokens
llama_perf_context_print: graphs reused = 106
```

@gryffindor-rr gryffindor-rr requested a review from CISC August 7, 2025 09:15
@CISC (Collaborator) commented Aug 7, 2025:

Can merge when @ngxson approves.

Labels: documentation, examples, python
3 participants