[Epic] Improve concurrency in LLMBlock #135

@markmc

Description

From aakankshaduggal#8

Overview

The current implementation of LLMBlock sends the entire dataset to the OpenAI server as one large batch of requests. Requests at the end of the batch can wait so long for a response that they hit timeout errors, and extremely large batches risk overloading the backend server.

Proposed Changes

Use Python's concurrent.futures package to process the dataset concurrently in LLMBlock. The key changes are:

  1. Uses concurrent.futures with a thread pool to launch and manage parallel tasks.
  2. Lets users specify the number of requests to send in each batch.
  3. Lets users specify the number of concurrent worker threads that handle batches.
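The changes above could be sketched roughly as follows. This is a minimal illustration, not the actual LLMBlock implementation: the helper names (`chunked`, `process_batches`) and the `send_batch` callable standing in for the OpenAI client call are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(seq, size):
    """Split a sequence into successive batches of at most `size` items."""
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def process_batches(samples, send_batch, batch_size=32, num_workers=8):
    """Send `samples` to the backend in batches, `num_workers` batches at a time.

    `send_batch` is a callable taking a list of prompts and returning a list of
    completions (a hypothetical stand-in for the real request to the server).
    """
    batches = chunked(samples, batch_size)
    results = []
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map preserves input order, so outputs line up with the dataset
        for batch_result in pool.map(send_batch, batches):
            results.extend(batch_result)
    return results
```

Because each in-flight request now covers only `batch_size` prompts, no single request has to wait on the whole dataset, which addresses the timeout problem described above.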

Example Usage

If the user sets the concurrency to 8 and the batch size to 32, the system runs 8 concurrent threads, each sending a batch of 32 prompts, so up to 256 requests are in flight at the backend server at once.
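The load on the backend is just the product of the two knobs, which a small sanity check makes concrete (the variable names here are illustrative, not the actual LLMBlock parameter names):

```python
concurrency = 8   # concurrent worker threads
batch_size = 32   # prompts sent per request batch

# Maximum number of prompts in flight at the backend at once
in_flight = concurrency * batch_size
assert in_flight == 256
```

Tuning either knob independently lets users trade throughput against backend load.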

