From aakankshaduggal#8
Overview
The current implementation of LLMBlock sends the entire dataset as a single large batch of requests to the OpenAI server. This can cause some requests to wait too long for a response, resulting in timeout errors, and can overload the backend server with extremely large batches.
Proposed Changes
Add concurrent processing to LLMBlock using Python’s concurrent.futures package. The key changes are:
- Uses concurrent.futures with a thread pool to manage and launch parallel tasks.
- Allows users to specify the number of requests to send in each batch.
- Allows users to specify the number of concurrent worker threads to handle batches.
Example Usage
If the user sets the concurrency to 8 and the batch size to 32, the system will run 8 concurrent threads, each sending a batch of 32 prompts, resulting in a total of 256 requests processed simultaneously by the backend server.
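The batching scheme described above can be sketched as follows. This is a minimal illustration, not the actual LLMBlock implementation; the helper names and the `send_batch` callable are hypothetical stand-ins for the real API call:

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, batch_size):
    """Split items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def process_batched(prompts, batch_size, num_workers, send_batch):
    """Send prompts in batches using num_workers concurrent threads.

    send_batch is a callable that takes a list of prompts and returns a
    list of responses (in practice, one request to the backend per batch).
    """
    batches = chunk(prompts, batch_size)
    results = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map yields batch results in submission order, so the
        # combined output keeps the original prompt order.
        for batch_result in pool.map(send_batch, batches):
            results.extend(batch_result)
    return results


# Example: batch size 32 with 8 worker threads means up to 8 batches
# (256 prompts) are in flight against the backend at any one time.
prompts = [f"prompt-{i}" for i in range(100)]
responses = process_batched(prompts, batch_size=32, num_workers=8,
                            send_batch=lambda batch: [p.upper() for p in batch])
```

With 100 prompts and a batch size of 32, this produces 4 batches (32 + 32 + 32 + 4), dispatched across the thread pool.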