Llama4 VLM Continuous Batching Support #510
base: main
Signed-off-by: Mohit Soni <[email protected]>
"chunk_length": prefill_seq_len,
"chunk_ctx_len": chunk_ctx_len,
}
if continuous_batching:
nit: Can we update prefill and decode together on line 985? Do we need two separate if conditions?
Yup, not needed. I will do both in one if condition.
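A minimal sketch of the single-condition version discussed above. The helper name `apply_batch_settings` and the `"full_batch_size"` key are assumptions for illustration; only `lang_decode["batch_size"] = kv_cache_batch_size` appears in the diff, so the values set in each branch may differ in the actual change.

```python
def apply_batch_settings(lang_prefill, lang_decode, continuous_batching, kv_cache_batch_size):
    """Update both specialization dicts inside one if/else (hypothetical helper)."""
    if continuous_batching:
        # Assumed key name for the continuous-batching case.
        lang_prefill["full_batch_size"] = kv_cache_batch_size
        lang_decode["full_batch_size"] = kv_cache_batch_size
    else:
        # Key taken from the diff; applying it to prefill as well is an assumption.
        lang_prefill["batch_size"] = kv_cache_batch_size
        lang_decode["batch_size"] = kv_cache_batch_size
    return lang_prefill, lang_decode
```

Keeping prefill and decode in the same branch means the two dicts cannot drift apart when one condition is edited and the other is forgotten.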
else:
    lang_decode["batch_size"] = kv_cache_batch_size

lang = []
nit: better to use lang_specialization = [lang_prefill, lang_decode] for readability
@@ -969,18 +998,22 @@ def get_specializations(
    specializations["lang"] = lang
    return specializations, compiler_options
else:
    lang[0].pop("vision_size")
Better to do the pop in a loop rather than repeating it for each index.
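A short sketch of the loop suggested above, assuming `lang` is a list of specialization dicts that each carry a `"vision_size"` key. The helper name `drop_vision_size` is hypothetical.

```python
def drop_vision_size(lang):
    """Remove "vision_size" from every specialization dict in the list."""
    for spec in lang:
        # Passing None as the default avoids a KeyError if an entry lacks the key;
        # drop the default to keep the strict behavior of lang[0].pop("vision_size").
        spec.pop("vision_size", None)
    return lang
```

The loop also keeps working unchanged if more specializations are later appended to `lang`.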
@@ -637,6 +639,9 @@ def export(
export_dir,
)

import ipdb
remove
Please add tests for CB (continuous batching) and update the table in the validated model list with CB support.