Open
Labels
enhancement (New feature or request)
Description
Is your feature request related to a problem? Please describe:
Currently, matrix inference only supports passing text prompts; it does not accept prompt embeddings. vLLM, however, supports this: https://docs.vllm.ai/en/stable/features/prompt_embeds.html
Describe the solution you would like:
Allow users to pass prompt_embeds instead of prompts as the payload.
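To make the request concrete, here is a rough sketch of what such a payload could look like. It assumes the payload mirrors what the vLLM docs linked above describe for the OpenAI-compatible completions endpoint: the tensor is serialized (e.g. with `torch.save` into an in-memory buffer) and sent base64-encoded under `prompt_embeds`. The helper names `build_embeds_payload` and `decode_embeds`, and the exact field layout on the matrix side, are hypothetical and only meant for illustration.

```python
import base64


def build_embeds_payload(model: str, embeds_bytes: bytes, max_tokens: int = 16) -> dict:
    """Build a completion-style request body that carries embeddings, not text.

    `embeds_bytes` is assumed to be an already-serialized tensor, for example
    the bytes produced by torch.save(tensor, buffer) on the client side.
    """
    if not embeds_bytes:
        raise ValueError("embeds_bytes must not be empty")
    return {
        "model": model,
        # Hypothetical field name, chosen to mirror vLLM's `prompt_embeds`.
        "prompt_embeds": base64.b64encode(embeds_bytes).decode("utf-8"),
        "max_tokens": max_tokens,
    }


def decode_embeds(payload: dict) -> bytes:
    """Server-side counterpart: recover the serialized tensor bytes."""
    return base64.b64decode(payload["prompt_embeds"])
```

Base64 keeps the payload JSON-safe, so it could travel through the same request path as text prompts; no `prompt` field is set when embeddings are supplied.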
Describe the alternatives you have considered:
vLLM engines deployed locally can be used, but they do not scale.