Closed
Description
Seeing as this is being built from the ground up, I was wondering if it's possible to implement something similar to ggml-org/llama.cpp#3228, so that parallel inference is natively supported.