Commit 16a2dd5

[model] fix qwen25vl config (#929)
### PR Category

Serve

### PR Types

Bug Fixes

### PR Description

Fix the serve config for qwen2.5vl.
1 parent 5d55464 · commit 16a2dd5

File tree

1 file changed (+1, −2 lines)


examples/qwen2_5_vl/conf/serve/32b_instruct.yaml

Lines changed: 1 addition & 2 deletions
```diff
@@ -8,8 +8,7 @@
 pipeline_parallel_size: 1
 max_num_seqs: 8 # Even at full 32,768 context usage, 8 concurrent operations won't trigger OOM
 gpu_memory_utilization: 0.9
-limit_mm_per_prompt: image=18 # should be customized, 18 images/request is enough for most scenarios
+limit_mm_per_prompt: '{"image": 18}' # should be customized, 18 images/request is enough for most scenarios
 port: 9010
 trust_remote_code: true
-enforce_eager: true # better compare to FlagGems
 enable_chunked_prefill: true
```
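The substantive change swaps the old `image=18` key=value form for a JSON string, `'{"image": 18}'`, which maps each modality to its per-request cap. As a minimal sketch of the difference between the two formats, the hypothetical helper below (not the serving engine's actual parser) accepts both and normalizes them to a `{modality: max_count}` dict:

```python
import json


def parse_limit_mm_per_prompt(value: str) -> dict:
    """Normalize a limit_mm_per_prompt value to {modality: max_count}.

    Hypothetical helper illustrating the two formats seen in this diff:
    the old 'image=18' key=value style and the new JSON string
    '{"image": 18}'.
    """
    value = value.strip()
    if value.startswith("{"):
        # New style: a JSON object, e.g. '{"image": 18}'
        return {k: int(v) for k, v in json.loads(value).items()}
    # Old style: comma-separated key=value pairs, e.g. 'image=18,video=2'
    return {k: int(v) for k, v in (pair.split("=", 1) for pair in value.split(","))}


print(parse_limit_mm_per_prompt('{"image": 18}'))  # {'image': 18}
print(parse_limit_mm_per_prompt("image=18"))       # {'image': 18}
```

Quoting the JSON object in the YAML file keeps it a plain string, so it survives YAML parsing intact and can be handed to the engine's own JSON-based parsing.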

0 commit comments