fix(mistralai): enforce max_concurrent_requests #34142

SSOBHY2 · 2025-11-29T01:21:52Z

This pull request makes the existing max_concurrent_requests setting for the Mistral integration actually control how many HTTP requests can be in flight at once, both for chat models and for embeddings.

For both ChatMistralAI and MistralAIEmbeddings, the code that builds the HTTPX clients now creates an httpx.Limits object using max_concurrent_requests. That limits object sets both the maximum number of simultaneous connections and the maximum number of keep-alive connections to the value of max_concurrent_requests. These limits are then passed into httpx.Client and httpx.AsyncClient when they are created. In practice, that means even if user code tries to send many requests at once, the underlying HTTP connection pool will only open up to max_concurrent_requests active connections to the Mistral API.

For embeddings, there was an additional issue on the async side: the aembed_documents method uses asyncio.gather over batches, which can easily start many HTTP calls at once. To make sure max_concurrent_requests is respected at the application level as well (not only at the connection-pool level), the embeddings class now has a private asyncio.Semaphore attribute. This semaphore is initialized to max_concurrent_requests and wraps each async embedding batch call. When aembed_documents runs, each batch goes through a helper function that first acquires the semaphore, then performs the HTTP POST to the /embeddings endpoint, and releases the semaphore when the request is done. As a result, at most max_concurrent_requests embedding batches are in flight at any time, even when aembed_documents is called with many batches or from many tasks at once.
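The acquire, POST, release pattern described above can be sketched with the standard library alone. `BoundedEmbedder`, `_post_batch`, and `_bounded_post` are hypothetical names for this illustration, and the network call is simulated with a short sleep; this is not the actual MistralAIEmbeddings code.

```python
import asyncio
from typing import List


class BoundedEmbedder:
    """Hypothetical stand-in for an embeddings class (not the real one)."""

    def __init__(self, max_concurrent_requests: int) -> None:
        # One semaphore per instance, sized to the configured limit.
        self._semaphore = asyncio.Semaphore(max_concurrent_requests)

    async def _post_batch(self, batch: List[str]) -> List[List[float]]:
        # Simulated request; a real client would POST to /embeddings here.
        await asyncio.sleep(0.01)
        return [[float(len(text))] for text in batch]

    async def _bounded_post(self, batch: List[str]) -> List[List[float]]:
        # Acquire before the request; released on exit, even on error.
        async with self._semaphore:
            return await self._post_batch(batch)

    async def aembed_documents(self, batches: List[List[str]]) -> List[List[float]]:
        # gather() schedules every batch at once; the semaphore caps
        # how many of them are actually running the request.
        responses = await asyncio.gather(
            *(self._bounded_post(batch) for batch in batches)
        )
        return [vector for response in responses for vector in response]


async def demo() -> List[List[float]]:
    # Construct inside the running loop so the semaphore binds to it.
    embedder = BoundedEmbedder(max_concurrent_requests=2)
    return await embedder.aembed_documents([["a", "bb"], ["ccc"], ["dddd"]])
```

Running `asyncio.run(demo())` returns one vector per input text, in input order, because asyncio.gather preserves the order of the awaitables it was given.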

The retry logic for requests (using tenacity) is still in place and works the same as before. The semaphore and HTTPX limits are layered on top of that, so you still get retries for transient errors, but now with real, enforced concurrency limits.

The scope line “Scope: libs/partners/mistralai” means that all code changes and new tests are limited to the Mistral partner package. No other packages in the monorepo are touched. This keeps the change focused and aligns with the guideline that PRs should not affect multiple packages unless necessary.

The “Breaking changes: None” line is important. Public APIs such as class names, constructor parameters, and method signatures are unchanged. The max_concurrent_requests parameter already existed and was documented; the PR just makes it actually control concurrency as advertised. From a user’s perspective, code that worked before still runs without modification. The only behavioral difference is that concurrency is now bounded by the value they configured, which is the expected and safer behavior.

The tests line “uv run --group test pytest libs/partners/mistralai/tests/unit_tests -q” describes how the new and existing unit tests for this package are run. That command uses the project’s uv-based environment to execute pytest only for the Mistral partner tests. Inside those tests, there are new checks that:

- Confirm that httpx.Client and httpx.AsyncClient receive an httpx.Limits instance whose max_connections and max_keepalive_connections match max_concurrent_requests.
- Confirm that aembed_documents actually respects the concurrency bound. This is done by patching the async HTTP POST method with a fake function that counts how many requests are “active” at once and ensuring that this number never exceeds the configured limit (for example, max_concurrent_requests set to 1).

Those tests all pass, so we know that both the HTTPX client limits and the semaphore-based concurrency control behave as intended.
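For readers who want to see the shape of the second check, here is a minimal sketch of the counting-fake technique. `TinyEmbedder`, `aembed_batches`, and `measure_peak_concurrency` are hypothetical stand-ins written for this example; the real tests patch the Mistral embeddings client and may be structured differently.

```python
import asyncio
from unittest.mock import patch


class TinyEmbedder:
    """Hypothetical class under test; stands in for the real embeddings class."""

    def __init__(self, max_concurrent_requests):
        self._semaphore = asyncio.Semaphore(max_concurrent_requests)

    async def _post(self, batch):
        raise NotImplementedError("replaced by the fake in the test")

    async def aembed_batches(self, batches):
        async def bounded(batch):
            async with self._semaphore:
                return await self._post(batch)

        return await asyncio.gather(*(bounded(batch) for batch in batches))


async def measure_peak_concurrency(limit, n_batches):
    active = 0
    peak = 0

    async def fake_post(batch):
        # Count requests in flight and remember the highest value seen.
        nonlocal active, peak
        active += 1
        peak = max(peak, active)
        await asyncio.sleep(0.01)  # yield so other batches could overlap
        active -= 1
        return [[0.0] for _ in batch]

    embedder = TinyEmbedder(max_concurrent_requests=limit)
    with patch.object(embedder, "_post", fake_post):
        await embedder.aembed_batches([["text"]] * n_batches)
    return peak


def test_concurrency_is_bounded():
    # With a limit of 1, no two fake requests may ever overlap.
    assert asyncio.run(measure_peak_concurrency(limit=1, n_batches=5)) == 1
```

The sleep inside the fake matters: without a suspension point, each fake request would finish before the next one started, and the test would pass even if the semaphore were missing.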

AI disclaimer: This contribution was developed with the assistance of an AI coding assistant. I reviewed and double-checked all code and tests myself, and the AI was only used to assist in developing the changes, not to make final decisions.

- Wire max_concurrent_requests into HTTPX client limits for Mistral chat and embeddings
- Bound concurrent aembed_documents calls with a semaphore
- Add unit tests for HTTPX limits and async concurrency

codspeed-hq · 2025-11-29T01:23:44Z

CodSpeed Performance Report

Merging #34142 will not alter performance

Comparing SSOBHY2:changes (8df6aa7) with master (b7091d3)¹

Summary

✅ 1 untouched
⏩ 33 skipped²

¹ No successful run was found on master (5a7cf87) during the generation of this report, so b7091d3 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.
² 33 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

SSOBHY2 added 3 commits November 28, 2025 20:06

fix(partners-mistralai): enforce max_concurrent_requests

f3dceb0

- Wire max_concurrent_requests into HTTPX client limits for Mistral chat and embeddings
- Bound concurrent aembed_documents calls with a semaphore
- Add unit tests for HTTPX limits and async concurrency

fix issues

e763544

more linter issues fixed

72a85c0

SSOBHY2 requested review from ccurme and mdrxy as code owners November 29, 2025 01:21

github-actions bot added the fix, integration, and mistralai labels and removed the fix label Nov 29, 2025

github-actions bot added the fix label Nov 29, 2025

SSOBHY2 added 2 commits December 1, 2025 15:18

Merge branch 'master' into changes

c29a9d5

Merge branch 'master' into changes

8df6aa7
