[tests] add qwen2_5_vl batch_infer test #5975
Conversation
Summary of Changes: Hello @Jintao-Huang, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the testing framework by replacing a single-instance test for qwen2_5_vl with a more comprehensive batch inference test.
Code Review
This pull request adds a new batch inference test for qwen2_5_vl. The new test is more comprehensive than the one it replaces. My review focuses on improving the readability and maintainability of the new test code by refactoring duplicated data and hardcoded values.
def test_qwen2_5_vl_batch_infer():
    from qwen_vl_utils import process_vision_info
    pt_engine = PtEngine('Qwen/Qwen2.5-VL-7B-Instruct', max_batch_size=2)
    request_config = RequestConfig(max_tokens=128, temperature=0)
    resp = pt_engine.infer([{
        'messages': [{
            'role': 'user',
            'content': '<image>What kind of dog is this?'
        }],
        'images': ['https://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen2-VL/demo_small.jpg']
    }, {
        'messages': [{
            'role': 'user',
            'content': '<video>describe the video.'
        }],
        'videos': ['https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4']
    }],
                          request_config=request_config)
    response_list = [resp[0].choices[0].message.content, resp[1].choices[0].message.content]
    model = pt_engine.model
    template = pt_engine.default_template
    processor = template.processor
    messages1 = [{
        'role': 'user',
        'content': [
            {
                'type': 'image',
                'image': 'https://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen2-VL/demo_small.jpg'
            },
            {
                'type': 'text',
                'text': 'What kind of dog is this?'
            },
        ],
    }]
    messages2 = [{
        'role': 'user',
        'content': [
            {
                'type': 'video',
                'video': 'https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4'
            },
            {
                'type': 'text',
                'text': 'describe the video.'
            },
        ],
    }]
    messages = [messages1, messages2]

    texts = [processor.apply_chat_template(msg, tokenize=False, add_generation_prompt=True) for msg in messages]
    image_inputs, video_inputs = process_vision_info(messages)
    inputs = processor(
        text=texts,
        images=image_inputs,
        videos=video_inputs,
        padding=True,
        return_tensors='pt',
        padding_side='left',
    )
    inputs = inputs.to('cuda')

    # Batch Inference
    generated_ids = model.generate(**inputs, max_new_tokens=128)
    generated_ids_trimmed = [out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)]
    output_texts = processor.batch_decode(
        generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False)
    assert output_texts == response_list
The test function test_qwen2_5_vl_batch_infer can be refactored to improve readability and reduce code duplication.
- Hardcoded URLs and queries are used multiple times. They can be extracted into constants.
- The construction of infer_data for pt_engine.infer and messages for manual inference is repetitive and can be simplified by defining the test data once and reusing it.
Applying these changes will make the test easier to understand and maintain.
def test_qwen2_5_vl_batch_infer():
    from qwen_vl_utils import process_vision_info
    pt_engine = PtEngine('Qwen/Qwen2.5-VL-7B-Instruct', max_batch_size=2)
    request_config = RequestConfig(max_tokens=128, temperature=0)
    image_url = 'https://qianwen-res.oss-accelerate-overseas.aliyuncs.com/Qwen2-VL/demo_small.jpg'
    video_url = 'https://modelscope-open.oss-cn-hangzhou.aliyuncs.com/images/baby.mp4'
    image_query = 'What kind of dog is this?'
    video_query = 'describe the video.'
    infer_data = [{
        'messages': [{
            'role': 'user',
            'content': f'<image>{image_query}'
        }],
        'images': [image_url]
    }, {
        'messages': [{
            'role': 'user',
            'content': f'<video>{video_query}'
        }],
        'videos': [video_url]
    }]
    resp = pt_engine.infer(infer_data, request_config=request_config)
    response_list = [r.choices[0].message.content for r in resp]
    model = pt_engine.model
    template = pt_engine.default_template
    processor = template.processor
    manual_inference_data = [
        {'type': 'image', 'media': image_url, 'text': image_query},
        {'type': 'video', 'media': video_url, 'text': video_query},
    ]
    messages = [[{
        'role': 'user',
        'content': [
            {'type': item['type'], item['type']: item['media']},
            {'type': 'text', 'text': item['text']},
        ],
    }] for item in manual_inference_data]
    texts = [processor.apply_chat_template(msg, tokenize=False, add_generation_prompt=True) for msg in messages]
    image_inputs, video_inputs = process_vision_info(messages)
    inputs = processor(
        text=texts,
        images=image_inputs,
        videos=video_inputs,
        padding=True,
        return_tensors='pt',
        padding_side='left',
    )
    inputs = inputs.to('cuda')
    # Batch Inference
    generated_ids = model.generate(**inputs, max_new_tokens=128)
    generated_ids_trimmed = [out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)]
    output_texts = processor.batch_decode(
        generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False)
    assert output_texts == response_list
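A side note on the trimming step in both versions of the test: slicing out_ids[len(in_ids):] only isolates the newly generated tokens because padding_side='left' is used, so every prompt in the batch ends at the same position and the generated tokens start right after it. Here is a minimal sketch with made-up token IDs (no model or tokenizer involved) illustrating why the slice works under left padding; the specific IDs are purely illustrative.

```python
# Two prompts of different lengths, left-padded with pad_id=0 to a common length of 4.
pad_id = 0
input_ids = [
    [0, 0, 11, 12],   # prompt of length 2, left-padded
    [0, 21, 22, 23],  # prompt of length 3, left-padded
]
# generate() returns each padded prompt followed by its new tokens,
# so all prompts occupy the first len(in_ids) positions of the output.
generated_ids = [
    [0, 0, 11, 12, 101, 102],
    [0, 21, 22, 23, 201, 202],
]
trimmed = [out[len(inp):] for inp, out in zip(input_ids, generated_ids)]
print(trimmed)  # [[101, 102], [201, 202]]
```

With right padding this slice would be wrong: the pad tokens would sit between the prompt and the generated tokens, and the fixed-length cut would leave padding in the decoded output.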
No description provided.