Fix MiniCPM-V model converter and clip to avoid hardcoded values. #14750


Open · wants to merge 5 commits into master

Conversation

@gryffindor-rr (Author)

Tested with llama-mtmd-cli against multiple MiniCPM-V models.
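For illustration only, here is a minimal sketch of the general idea behind the fix: read model parameters from the checkpoint's `config.json` instead of hardcoding them per MiniCPM-V version. The function name, field names, and fallback defaults below are assumptions for the sketch, not the PR's actual code.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical helper (not the PR's actual code): derive vision/resampler
# parameters from the model's config.json rather than branching on a
# hardcoded per-version table in the converter.
def load_vision_params(config_path):
    cfg = json.loads(Path(config_path).read_text())
    vision = cfg.get("vision_config", {})
    return {
        # Fall back to assumed defaults only when a field is absent.
        "hidden_size": vision.get("hidden_size", 1152),
        "image_size": vision.get("image_size", 448),
        "patch_size": vision.get("patch_size", 14),
        "query_num": cfg.get("query_num", 64),
    }

if __name__ == "__main__":
    # Demonstrate with a throwaway config.json written to a temp directory.
    with tempfile.TemporaryDirectory() as d:
        p = Path(d) / "config.json"
        p.write_text(json.dumps({
            "query_num": 64,
            "vision_config": {"hidden_size": 1152, "image_size": 448,
                              "patch_size": 14},
        }))
        print(load_vision_params(p))
```

The benefit is that a new model release (such as MiniCPM-V 4.0) only needs a valid `config.json`, rather than another hardcoded branch in the converter.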

@github-actions bot added the examples and python (python script changes) labels on Jul 18, 2025
@ngxson ngxson self-requested a review July 18, 2025 09:22
@ngxson (Collaborator) left a comment:

Please create PR from a non-master branch next time, otherwise I cannot push my corrections directly to this PR

@gryffindor-rr (Author) replied:

> Please create PR from a non-master branch next time, otherwise I cannot push my corrections directly to this PR

Sure. Which branch would you suggest? Shall I create a new one, say 'minicpmv'?

@gryffindor-rr gryffindor-rr requested a review from ngxson July 28, 2025 05:51
@github-actions bot added the documentation (improvements or additions to documentation) label on Jul 29, 2025
@CISC (Collaborator) commented Jul 31, 2025:

This needs to be updated for MiniCPM-V 4.0 that was just merged.

@gryffindor-rr (Author) replied:

> This needs to be updated for MiniCPM-V 4.0 that was just merged.

Done. Converted the models by following the instructions in https://github.com/ggml-org/llama.cpp/blob/master/docs/multimodal/minicpmv4.0.md

Example output:

```
./build/bin/llama-mtmd-cli -m ~/models/MiniCPM-V-4/model/Model-3.6B-F16.gguf --mmproj ~/models/MiniCPM-V-4/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image ~/codeup/edge_infer_sdk/tests/data/cat.png -p 'Describe image' -ngl 99
main: loading model: /cache/zhanglei/models/MiniCPM-V-4/model/Model-3.6B-F16.gguf
encoding image slice...
image slice encoded in 99 ms
decoding image batch 1/1, n_tokens_batch = 64
image decoded (batch 1/1) in 8 ms

The image features a close-up of a cat. The feline has distinct tabby markings on its fur, characterized by the distinctive stripes and swirls typical of the tabby pattern found in many domestic cats.
The background is blurred but appears to be an indoor setting with neutral colors that do not detract from the subject.
There are no visible texts or additional objects within this frame. The overall composition focuses solely on capturing the detailed features and calm demeanor of the cat as it gazes slightly upwards.

llama_perf_context_print: load time = 1823.19 ms
llama_perf_context_print: prompt eval time = 429.45 ms / 77 tokens ( 5.58 ms per token, 179.30 tokens per second)
llama_perf_context_print: eval time = 960.62 ms / 110 runs ( 8.73 ms per token, 114.51 tokens per second)
llama_perf_context_print: total time = 1852.51 ms / 187 tokens
llama_perf_context_print: graphs reused = 106
```

@gryffindor-rr gryffindor-rr requested a review from CISC August 7, 2025 09:15
@CISC (Collaborator) commented Aug 7, 2025:

Can merge when @ngxson approves.

Labels: documentation, examples, python
3 participants