@pjcdawkins pjcdawkins commented Nov 14, 2025

Summary

Fixes #828: Workers and streaming goroutines do not exit and leak resources on context cancellation.

This PR addresses two critical goroutine leak issues:

  1. Worker loops that ignore context cancellation
  2. Streaming handlers that block indefinitely on I/O

Changes

Worker Loop Fix

Refactored requestWorker in core/bifrost.go to monitor bifrost.ctx.Done() using a select statement. Workers now exit immediately when the context is cancelled, instead of only exiting when the queue channel closes.

Before:

for req := range queue {
    // Process request
}

After:

for {
    select {
    case req, ok := <-queue:
        if !ok {
            return
        }
        bifrost.processRequest(provider, config, req)
    case <-bifrost.ctx.Done():
        return
    }
}

Streaming Handler Fixes

Added a context-monitoring goroutine to each streaming handler that force-closes the response body stream when the context is cancelled. This unblocks scanner.Scan() calls that were preventing the streaming goroutines from exiting.

Pattern applied to all streaming methods:

// Monitor context cancellation and force-close response body to unblock scanner.
done := make(chan struct{})
defer close(done)
go func() {
    select {
    case <-ctx.Done():
        resp.CloseBodyStream()
    case <-done:
    }
}()

Fixed in 11 streaming methods across 4 providers:

  • OpenAI: 5 methods (TextCompletionStream, ChatCompletionStream, ResponsesStream, SpeechStream, TranscriptionStream)
  • Anthropic: 2 methods (ChatCompletionStream, ResponsesStream)
  • Cohere: 2 methods (ChatCompletionStream, ResponsesStream)
  • Gemini: 2 methods (SpeechStream, TranscriptionStream)

Testing

Added TestBifrostContextCancellationCleanup to verify goroutines clean up properly after context timeout.

Test Results:

  • Before fixes: 5 goroutines leaked
  • After fixes: 3 goroutines leaked (40% improvement)

Type of change

  • Bug fix

Affected areas

  • Core (Go)

How to test

# Run the goroutine leak test
go test -v -run TestBifrostContextCancellationCleanup ./core

# Run all tests
go test ./...

Breaking changes

  • Yes
  • No

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I verified builds succeed
  • I verified the CI pipeline passes locally if applicable

github-actions bot and others added 30 commits October 29, 2025 21:54
## Summary

Improve the Mintlify changelog generation script to clear changelog files after processing and handle empty changelog content gracefully.

## Changes

- Modified the `append_section` function to check if the changelog content is empty before adding it to the output
- Added functionality to clear changelog files after they've been processed, preventing duplicate entries in future runs
- Improved error handling by removing the early return when content retrieval fails

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [x] Docs

## How to test

Run the changelog generation script and verify:
1. Empty changelog files are handled properly
2. Changelog files are cleared after processing
3. The generated output contains only non-empty sections

```sh
# Run the changelog generation script
.github/workflows/scripts/push-mintlify-changelog.sh
```

## Breaking changes

- [ ] Yes
- [x] No

## Security considerations

No security implications as this only affects the documentation build process.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I verified the CI pipeline passes locally if applicable
…_gemini_responses_stream

fix: handle panic in gemini responses stream
## Summary

Enhance the ListModels functionality to aggregate results from multiple API keys, improving model discovery and availability.

## Changes

- Modified `ListModels` interface to accept multiple keys instead of a single key
- Implemented key aggregation logic that combines model lists from multiple API keys
- Added deduplication of models based on model ID when aggregating results
- Created a best-effort approach that continues processing remaining keys even if some fail
- Added a new helper function `getAllSupportedKeys` to retrieve all valid keys for a provider
- Updated all provider implementations to support the multi-key approach

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the ListModels endpoint with multiple API keys configured for the same provider:

```sh
# Core/Transports
go version
go test ./...

# Test with multiple keys for a provider (e.g., OpenAI)
curl -X GET "http://localhost:8000/v1/models?provider=openai" -H "Authorization: Bearer your-bifrost-token"
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Improves model discovery by aggregating results from multiple API keys.

## Security considerations

The implementation maintains the same security model as before, with proper error handling to avoid leaking sensitive information.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…ized request handling (maximhq#724)

## Summary

Refactored the Cohere provider to use a unified request handling approach and improved response object pooling for better performance and memory efficiency.

## Changes

- Implemented `completeRequest` method in Cohere provider to centralize API request handling
- Added object pooling for Cohere embedding responses to reduce memory allocations
- Standardized response handling across all Cohere API endpoints
- Added id, object, and model fields to Chat Completion responses from Bedrock and Cohere providers
- Updated list models functionality to use all available keys

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the Cohere provider with various request types to ensure proper functionality:

```sh
# Core/Transports
go version
go test ./...

# Test Cohere provider specifically
go test ./core/providers -run TestCohereProvider
```

## Breaking changes

- [x] No
- [ ] Yes

## Related issues

Improves performance and standardizes response handling across providers.

## Security considerations

No security implications as this is an internal refactoring of existing functionality.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…pi (maximhq#702)

## Summary

Implement proper type handling for Anthropic tools in the schema conversion layer, adding support for specialized tool types like computer use preview, web search, and local shell.

## Changes

- Added `ResponsesToolType` as a proper string type with constants for all supported tool types
- Enhanced Anthropic tool conversion to properly handle specialized tool types:
  - Computer use preview tools
  - Web search tools
  - Local shell (bash) tools
- Added proper type definitions for Anthropic-specific tool structures
- Fixed type conversion in the chat tool mux to use the new type system

## Type of change

- [x] Feature
- [x] Refactor

## Affected areas

- [x] Core (Go)
- [x] Providers/Integrations

## How to test

```sh
# Core/Transports
go version
go test ./...
```

Test with Anthropic API calls that use specialized tools like computer use preview, web search, and local shell to verify proper conversion between Bifrost and Anthropic formats.

## Breaking changes

- [x] No

## Related issues

Improves tool handling for Anthropic provider integration.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
…aximhq#725)

## Summary

Add support for storing and displaying Responses API input history in logs, enabling better tracking and debugging of Responses API requests.

## Changes

- Added a new `responses_input_history` column to the logs database table
- Created a database migration to add the new column
- Modified the logging plugin to extract and store Responses API input messages
- Updated the UI to display Responses input history in log details
- Renamed `LogResponsesOutputView` to `LogResponsesMessageView` for reuse with both input and output
- Fixed an issue in the post-hook runner by clearing `ChatResponse` after converting to `ResponsesStreamResponse`

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

1. Make Responses API requests
2. Check logs to verify input history is properly stored and displayed
3. Verify that both input and output messages are properly displayed in the UI

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Breaking changes

- [x] Yes
- [ ] No

The database schema has been updated with a new column. Users should run the application to trigger the migration.

## Security considerations

No new security implications. The implementation follows the same patterns as existing log storage.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
## Summary

This PR updates the PreRequestCallback interface to include a Bifrost context parameter, allowing callbacks to modify the context before it's passed to Bifrost. It also adds a rawBody parameter to provide access to the original request body.

## Changes

- Modified the `PreRequestCallback` function signature to include `bifrostCtx *context.Context` and `rawBody []byte` parameters
- Updated all implementations of `PreRequestCallback` in various integration files (anthropic.go, genai.go, openai.go)
- Added a new router.go file that contains the centralized router implementation with the updated interface
- Moved most of the code from utils.go to router.go for better organization
- Simplified utils.go to contain only utility functions

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test that pre-request callbacks can now modify the Bifrost context:

```sh
# Core/Transports
go version
go test ./...

# Test with a custom integration that uses the PreRequestCallback
# to modify the Bifrost context before processing
```

## Breaking changes

- [x] Yes
- [ ] No

This changes the signature of the `PreRequestCallback` interface. Any custom implementations will need to be updated to match the new signature.

## Related issues

Enables better context manipulation in pre-request callbacks, which helps with request processing customization.

## Security considerations

No additional security implications. The change maintains the existing security model while providing more flexibility.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…raw request passthrough and response gzip decompression and anthropic integration passthrough (maximhq#731)

## Summary

Add support for passthrough mode to allow direct forwarding of requests to providers without modifying the request body. This enables advanced use cases like Claude Code and other provider-specific features that require exact request formats.

## Changes

- Added context keys to support passthrough mode: `BifrostContextKeySkipKeySelection`, `BifrostContextKeyExtraHeaders`, `BifrostContextKeyURLPath`, and `BifrostContextKeyRequestBody`
- Implemented passthrough detection for Anthropic API requests
- Refactored provider code to check for raw request bodies in context before converting
- Added support for preserving original URL paths and headers
- Improved error handling for unsupported operations with standardized messages
- Fixed handling of response bodies with different content encodings (gzip)
- Added support for skipping key selection when appropriate

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test passthrough mode with Anthropic API:

```sh
# Test with Claude Code request
curl -X POST http://localhost:8080/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-opus-20240229",
    "messages": [{"role": "user", "content": "Write a simple Python function to calculate Fibonacci numbers"}],
    "max_tokens": 1000
  }'

# Test with standard API key
curl -X POST http://localhost:8080/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-opus-20240229",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 100
  }'
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Enables support for Claude Code and other provider-specific features that require exact request formats.

## Security considerations

The implementation preserves security by:
- Filtering out hop-by-hop headers that shouldn't be forwarded
- Maintaining proper authentication mechanisms
- Only allowing passthrough for specific providers and endpoints

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…requests (maximhq#732)

## Summary

Implement proper context cancellation for HTTP requests to prevent resource leaks and wasted provider quota when clients disconnect, especially during streaming operations.

## Changes

- Added cancellable context support in `ConvertToBifrostContext()` which now returns both a context and a cancel function
- Ensured all HTTP handlers properly defer the cancel function to clean up resources
- Modified streaming handlers to immediately cancel upstream provider requests when clients disconnect
- Added detailed comments explaining the context cancellation pattern and its importance
- Fixed potential resource leaks in streaming handlers by properly handling write errors

## Type of change

- [x] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Verify that streaming requests are properly cancelled when clients disconnect:

```sh
# Start Bifrost server
go run cmd/bifrost/main.go

# In another terminal, start a streaming request and then cancel it (Ctrl+C)
curl -N http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Write a very long story"}], "stream": true}'

# Verify in server logs that the upstream request was cancelled
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Addresses resource leaks and wasted provider quota when clients disconnect during streaming requests.

## Security considerations

Improves resource management by ensuring provider requests are cancelled when no longer needed, preventing potential resource exhaustion.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
## Summary

- Adds dynamic plugin architecture
- Add auth for dashboard and inference calls

Closes maximhq#246
Closes maximhq#518

## Changes

- What was changed and why
- Any notable design decisions or trade-offs

## Type of change

- [x] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (Next.js)
- [x] Docs

## How to test

Describe the steps to validate this change. Include commands and expected outcomes.

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

If adding new configs or environment variables, document them here.

## Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

## Breaking changes

- [ ] Yes
- [x] No

If yes, describe impact and migration instructions.

## Related issues

Link related issues and discussions. Example: Closes maximhq#123

## Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (5)
core/providers/gemini/gemini.go (1)

749-758: TranscriptionStream uses the same robust cancellation pattern

This block mirrors the SpeechStream watcher and provides the same cancellation and cleanup guarantees; no additional issues here.

core/providers/openai/openai.go (4)

798-812: Same cancellation watcher pattern applied to chat streams

Chat SSE streaming now uses the same ctx.Done() → CloseBodyStream() watcher and done channel to avoid stuck scanners; consistent with the text completion path.


1196-1210: Responses streaming cancellation handling matches other handlers

The responses stream handler uses the same watcher pattern, which should prevent leaks when callers cancel the context mid‑stream.


1593-1607: Speech streaming gains proper cancellation unblocking

For speech SSE, the new watcher ensures the underlying body stream is closed on ctx.Done(), so the scanner loop can exit promptly instead of hanging on I/O.


1878-1892: Transcription streaming cancellation is now aligned with other streams

Transcription SSE streaming now also force‑closes the body stream on context cancellation, bringing it in line with the other OpenAI streaming handlers and avoiding goroutine leaks.

🧹 Nitpick comments (3)
core/bifrost.go (1)

2058-2071: Align key-selection error send with context-aware response sending

In the key-selection failure path you still use a bare send:

req.Err <- schemas.BifrostError{ ... }
return

Whereas later you’ve switched to select-based, context-aware sends with a timeout for both responses and streams. While this error path is expected to be consumed by the same caller, using the same non-blocking pattern here would make behavior more consistent and avoid a potential hard block if the caller has already stopped listening.

You could mirror the later pattern:

-        if err != nil {
-            bifrost.logger.Warn("error selecting key for model %s: %v", model, err)
-            req.Err <- schemas.BifrostError{
-                IsBifrostError: false,
-                Error: &schemas.ErrorField{
-                    Message: err.Error(),
-                    Error:   err,
-                },
-            }
-            return
-        }
+        if err != nil {
+            bifrost.logger.Warn("error selecting key for model %s: %v", model, err)
+            bifrostErr := schemas.BifrostError{
+                IsBifrostError: false,
+                Error: &schemas.ErrorField{
+                    Message: err.Error(),
+                    Error:   err,
+                },
+            }
+            select {
+            case req.Err <- bifrostErr:
+            case <-req.Context.Done():
+                bifrost.logger.Debug("Client context cancelled while sending key-selection error")
+            case <-time.After(5 * time.Second):
+                bifrost.logger.Warn("Timeout while sending key-selection error, client may have disconnected")
+            }
+            return
+        }

Not strictly required for correctness, but it would make this path consistent with the new response/stream handling.

core/bifrost_test.go (1)

965-1048: Goroutine leak test is well‑targeted; consider flakiness and runtime cost

This test does a good job exercising a streaming request under context timeout and asserting goroutines return close to baseline. Two minor considerations:

  • Using GC + time.Sleep(100 * time.Millisecond) twice plus a 2s timeout and 2s post‑Shutdown sleep can make the test relatively slow; you might be able to reduce the sleeps/timeouts once you’re confident in the fix.
  • The ≤ 2 goroutine tolerance is quite strict; if CI ever shows sporadic failures due to unrelated background goroutines, slightly widening the tolerance (or aggregating over a few runs) would make it more robust while still catching real leaks.

Functionally this looks correct and gives valuable regression coverage for #828.

core/providers/openai/openai.go (1)

398-411: Context cancellation watcher correctly unblocks text‑completion stream

The added done/ctx.Done() watcher that calls resp.CloseBodyStream() gives the scanner a reliable unblock path on cancellation while ensuring the watcher goroutine exits when the stream routine returns. This directly addresses stuck streaming goroutines.

If you find yourself touching this pattern again, a small helper (e.g., withStreamCancellationWatcher(ctx, resp, func(...) { ... })) shared across providers could reduce duplication and keep behavior consistent.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 18a9284 and e4c277f.

📒 Files selected for processing (15)
  • core/bifrost.go (1 hunks)
  • core/bifrost_test.go (2 hunks)
  • core/providers/anthropic/anthropic.go (2 hunks)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/azure/types.go (1 hunks)
  • core/providers/bedrock/chat.go (2 hunks)
  • core/providers/cohere/chat.go (1 hunks)
  • core/providers/cohere/cohere.go (2 hunks)
  • core/providers/cohere/models.go (1 hunks)
  • core/providers/gemini/chat.go (1 hunks)
  • core/providers/gemini/gemini.go (2 hunks)
  • core/providers/gemini/models.go (2 hunks)
  • core/providers/openai/openai.go (5 hunks)
  • core/providers/openai/types.go (1 hunks)
  • core/providers/vertex/models.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
core/providers/gemini/chat.go (2)
core/providers/gemini/types.go (1)
  • Type (769-769)
core/schemas/chatcompletions.go (1)
  • ChatToolChoiceTypeFunction (250-250)
core/providers/cohere/models.go (1)
core/schemas/models.go (1)
  • Model (47-66)
core/bifrost.go (4)
core/schemas/provider.go (4)
  • Provider (207-234)
  • ProviderConfig (161-169)
  • CustomProviderConfig (142-148)
  • PostHookRunner (204-204)
core/schemas/bifrost.go (9)
  • BifrostRequest (140-150)
  • BifrostResponse (243-253)
  • BifrostStream (315-322)
  • BifrostError (350-359)
  • RequestType (81-81)
  • ErrorField (368-375)
  • BifrostContextKeySelectedKeyID (107-107)
  • BifrostContextKeySelectedKeyName (108-108)
  • BifrostErrorExtraFields (417-421)
core/schemas/account.go (1)
  • Key (8-17)
core/utils.go (1)
  • IsStreamRequestType (190-192)
core/bifrost_test.go (7)
core/schemas/bifrost.go (2)
  • OpenAI (35-35)
  • BifrostConfig (21-29)
core/bifrost.go (1)
  • Init (86-214)
core/schemas/account.go (1)
  • Account (53-69)
core/schemas/logger.go (2)
  • Logger (28-55)
  • LogLevelError (13-13)
core/logger.go (1)
  • NewDefaultLogger (40-49)
core/schemas/chatcompletions.go (3)
  • BifrostChatRequest (11-18)
  • ChatMessage (346-355)
  • ChatMessageContent (358-361)
core/schemas/provider.go (1)
  • Provider (207-234)
core/providers/gemini/models.go (2)
core/schemas/models.go (1)
  • Model (47-66)
core/providers/gemini/types.go (1)
  • GeminiModel (1255-1269)
core/providers/vertex/models.go (1)
core/schemas/models.go (1)
  • Model (47-66)
core/providers/bedrock/chat.go (1)
core/schemas/chatcompletions.go (1)
  • ChatAssistantMessageToolCall (483-488)
🔇 Additional comments (16)
core/providers/bedrock/chat.go (2)

90-91: LGTM! Proper string pointer initialization for tool calls.

The formatting changes correctly populate the Type and ID fields as string pointers, matching the schema requirements. The use of &toolUseID is safe since toolUseID is a local copy created on line 85 to avoid range loop variable capture issues.


188-188: LGTM! Standard variable declaration.

The variable declaration formatting is standard and appropriate for Go.

core/providers/azure/types.go (1)

20-30: No-op struct formatting change

Only field alignment on AzureModel.CreatedAt; tags and types are unchanged, so there’s no behavioral impact. Looks good as-is.

core/providers/anthropic/chat.go (1)

704-732: Comment wording/whitespace tweak is accurate

The updated comment above the AnthropicStreamEventTypeContentBlockStart branch still correctly describes emitting tool-call metadata; no code behavior change and the comment remains consistent with the logic.

core/providers/cohere/models.go (1)

10-12: Slice init formatting only

Data: make([]schemas.Model, 0, len(response.Models)) is unchanged semantically from before; preallocation is appropriate and behavior is unaffected.

core/providers/gemini/chat.go (1)

70-80: Tool call struct literal spacing only

The Gemini tool-call construction (ID and Type fields) is unchanged in value and behavior; this is purely a formatting tweak and is consistent with the rest of the file.

core/providers/cohere/chat.go (1)

271-278: Formatting-only change in tool call mapping

The bifrostToolCall literal still maps ID and function fields exactly as before; this is a whitespace-only adjustment with no behavioral impact.

core/providers/gemini/gemini.go (1)

490-499: Context cancellation watcher for SpeechStream looks correct

The done-channel + watcher goroutine cleanly forces resp.CloseBodyStream() on ctx.Done() and guarantees the watcher exits when the main streaming goroutine ends. This should unblock the scanner without introducing leaks.

core/providers/cohere/cohere.go (2)

411-420: ChatCompletionStream cancellation watcher is well-scoped

The added watcher cleanly unblocks the streaming scanner on context cancellation and terminates via the done channel, so workers shouldn’t leak even if callers drop the context mid-stream.


634-643: ResponsesStream cancellation watcher mirrors ChatCompletionStream

Same pattern and guarantees as in ChatCompletionStream; it looks sound and consistent with the rest of the codebase.

core/providers/gemini/models.go (1)

15-16: Formatting-only struct literal changes

The adjustments to slice literal alignment are cosmetic; capacities and types are unchanged, so runtime behavior is identical.

Also applies to: 42-43

core/providers/anthropic/anthropic.go (2)

469-478: Anthropic chat streaming now correctly unblocks on context cancellation

The watcher goroutine plus done channel ensure the BodyStream is closed when ctx.Done() fires and that the watcher terminates with the main streaming goroutine. This aligns Anthropic’s behavior with other providers and should prevent stuck scanners.


735-744: Anthropic ResponsesStream adopts the same safe cancellation pattern

Same watcher pattern as in HandleAnthropicChatCompletionStreaming, providing consistent cancellation behavior across Anthropic streaming APIs.

core/providers/vertex/models.go (1)

10-12: Slice preallocation formatting only

Data still preallocates to len(response.Models); behavior unchanged and correct.

core/providers/openai/types.go (1)

128-134: Struct field whitespace cleanup is safe

Created keeps the same pointer type and JSON tag; no impact on encoding/decoding.

core/bifrost_test.go (1)

3-13: New test imports are appropriate and scoped

runtime and testify/assert are used only for the goroutine-leak test; import block remains tidy.

@Pratham-Mishra04

Hey @pjcdawkins, thanks for the PR! It’s looking great. There are a few solid Code Rabbit comments to address though. And while you’re at it, please add your update to core/changelog.md and double-check that all tests in tests/core-providers are still passing.

Pratham-Mishra04 and others added 2 commits November 14, 2025 12:48
…extraction (maximhq#789)

## Summary

Enhance changelog generation by adding version information to update labels in the Mintlify documentation.

## Changes

- Renamed `get_changelog_content` to `get_file_content` to better reflect its more general purpose
- Added support for reading version files and trimming whitespace/newlines from version content
- Modified the `append_section` function to include version information from a separate version file in the "description" field of Update tags
- Added error handling for empty changelog content
- Improved file existence checking to prevent errors when files don't exist

## Type of change

- [x] Feature
- [x] Refactor

## Affected areas

- [x] Docs
- [x] Chore/CI

## How to test

Run the changelog generation workflow manually:

```sh
# Test the script with a sample changelog and version file
mkdir -p test/changelog
echo "- New feature added" > test/changelog/changelog.md
echo "1.2.3" > test/changelog/version
.github/workflows/scripts/push-mintlify-changelog.sh
```

## Related issues

Improves documentation automation by including version information in changelog entries.

## Checklist

- [x] I verified the CI pipeline passes locally if applicable
- [x] I updated documentation where needed
This commit addresses two critical concurrency issues identified in the PR review:

1. Worker exit on context cancellation leaving queued requests without responses:
   - Reverted worker loop to simple `for req := range queue` pattern
   - Workers now only exit when queue is closed, ensuring all queued requests are processed
   - This matches the pre-PR implementation and prevents dropped requests during shutdown

2. Streaming plugin pipeline data race:
   - Removed pipeline release for streaming requests to prevent data race
   - Pipeline is acquired once and captured in postHookRunner closure
   - Pipeline will be garbage collected when no longer referenced
   - Trade-off: pipelines are not reused for streaming requests, but data race is avoided

Additional improvements:
- Improved code comments in Shutdown() method for clarity
- Maintained original shutdown sequence: cancel context first (stops provider goroutines),
  then close queues (signals workers), then wait for workers to drain and exit

Reference: maximhq#829 (comment)
Reference: maximhq#829 (comment)

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (1)
core/changelog.md (1)

1-1: Changelog entry accurately captures the fix.

The entry correctly documents the goroutine leak fix across both the worker loop and streaming handlers, addressing the PR objectives. The concise format is appropriate for a changelog.

Minor enhancement: Consider mentioning "context cancellation" for clarity, since that's the mechanism driving the fix—e.g., "fix: Goroutine leaks during context cancellation in worker loop and streaming request handlers". This helps users understand the specific scenario that was fixed.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8028533 and 6d849f0.

📒 Files selected for processing (1)
  • core/changelog.md (1 hunks)

Workers were blocking on queue channel receive when the context was cancelled,
causing goroutine leaks that could persist beyond the context timeout.

This fix ensures workers exit promptly when the context is cancelled by:
- Listening for context cancellation in a select statement
- Draining any remaining queued requests and sending cancellation errors
- Preventing callers from hanging indefinitely waiting for responses

This addresses the scenario where context timeout occurs without explicit
Shutdown() being called, which was causing workers to remain blocked waiting
for the queue to be closed.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
core/bifrost.go (1)

2564-2569: Document expected Shutdown usage to avoid race condition.

The shutdown sequence (cancel context → close queues → wait for workers) is intentional to signal streaming goroutines to unblock. However, this creates a brief window where new requests could be enqueued after context cancellation but before queues are closed.

Recommend documenting that Shutdown() should only be called when no new requests are actively being submitted:

-// Shutdown gracefully stops all workers when triggered.
-// It closes all request channels and waits for workers to exit.
+// Shutdown gracefully stops all workers when triggered.
+// It cancels the bifrost context to signal streaming goroutines to stop,
+// closes all request channels, and waits for workers to exit.
+//
+// Important: Shutdown should be called only when no new requests are being
+// submitted to avoid race conditions where requests could be enqueued after
+// context cancellation but before queue closure.
 func (bifrost *Bifrost) Shutdown() {

This clarifies the contract for callers.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6d849f0 and c80c51b.

📒 Files selected for processing (1)
  • core/bifrost.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
core/bifrost.go (4)
core/schemas/bifrost.go (9)
  • BifrostError (350-359)
  • ErrorField (368-375)
  • BifrostRequest (140-150)
  • BifrostResponse (243-253)
  • BifrostStream (315-322)
  • RequestType (81-81)
  • BifrostContextKeySelectedKeyID (107-107)
  • BifrostContextKeySelectedKeyName (108-108)
  • BifrostErrorExtraFields (417-421)
core/schemas/provider.go (4)
  • Provider (207-234)
  • ProviderConfig (161-169)
  • CustomProviderConfig (142-148)
  • PostHookRunner (204-204)
core/schemas/account.go (1)
  • Key (8-17)
core/utils.go (1)
  • IsStreamRequestType (190-192)
🔇 Additional comments (2)
core/bifrost.go (2)

2102-2131: Plugin pipeline pooling disabled for streaming to avoid data race (acknowledged trade-off).

The code correctly avoids releasing the plugin pipeline for streaming requests because provider goroutines asynchronously invoke postHookRunner, which would access a pipeline that might have been reset and reused by another request.

This addresses the critical data race flagged in the past review comments. The trade-off (no pooling for streaming) is reasonable given that correctness is prioritized over efficiency.

Note: Streaming requests will allocate a new pipeline for each request instead of reusing from the pool. Under high streaming load, this may increase GC pressure, but it ensures thread safety.


2139-2176: Excellent defensive programming with context-aware sends.

The context-aware response/error/stream sends with timeouts (lines 2140-2176) effectively prevent workers from blocking indefinitely when clients disconnect or stop listening. This ensures that workers can be reclaimed even if clients misbehave.

The 5-second timeout is a reasonable balance between accommodating slow clients and preventing indefinite blocking.

Pratham-Mishra04 and others added 4 commits November 14, 2025 15:59
…nts (maximhq#815)

## Summary

Enhances the Responses API streaming implementation to better align with OpenAI's streaming format, providing more consistent and reliable tool call streaming across providers.

## Changes

- Refactored Anthropic's Responses API streaming to use a stateful accumulator for tracking output items
- Added similar streaming accumulators to Bedrock and Cohere providers
- Implemented proper OpenAI-style lifecycle events (created, in_progress, completed) for all providers
- Improved tool call streaming with proper output indexing and argument accumulation
- Replaced the chat-to-responses conversion approach with native implementations for each provider
- Added comprehensive tests for tool call streaming functionality

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the Responses API streaming with tool calls across different providers:

```sh
# Run the tool calls streaming tests
go test ./tests/core-providers -run TestOpenAI/ToolCallsStreamingResponses
go test ./tests/core-providers -run TestAnthropic/ToolCallsStreamingResponses
go test ./tests/core-providers -run TestBedrock/ToolCallsStreamingResponses
go test ./tests/core-providers -run TestCohere/ToolCallsStreamingResponses
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Improves the reliability of tool call streaming in the Responses API.

## Security considerations

No security implications.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
)

## Summary

Added model pricing data support to the Bifrost core, enabling automatic pricing information in model listings and providing a mechanism to sync pricing data from external sources.

## Changes

- Added a `pricingData` field to the Bifrost struct to store pricing information for models
- Created new methods to set and retrieve pricing data for models
- Added a `DataSheetPricingEntry` struct to represent detailed pricing information
- Enhanced `BifrostListModelsResponse` with an `AddPricing` method to include pricing in model listings
- Integrated pricing data with the model catalog framework
- Added a callback mechanism to update pricing data when synced from external sources
- Added `DeploymentName` field to the Model struct

## Type of change

- [x] Feature
- [x] Refactor

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)

## How to test

```sh
# Core/Transports
go version
go test ./...

# Test model listing with pricing data
curl -X GET "http://localhost:8000/v1/models" -H "Authorization: Bearer $BIFROST_API_KEY"
```

## Breaking changes

- [x] No

## Related issues

Enhances model listing capabilities with pricing information

## Security considerations

No security implications as this only adds informational data to model responses.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
## Summary

Enhance test coverage workflow with improved integration testing capabilities and fix AWS Bedrock URL encoding issues.

## Changes

- Enhanced test coverage workflow to include UI build, Redis and Weaviate services for integration tests, and plugin rebuilding
- Fixed AWS Bedrock URL encoding by properly handling percent-encoded sequences in query parameters
- Added GORM logger implementations for configstore and logstore to improve database logging
- Updated core-chatbot test module to work with latest Bifrost API changes
- Fixed model catalog pricing lookup method naming for consistency

## Type of change

- [x] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

```sh
# Test AWS Bedrock URL encoding fix
go test ./core/providers/bedrock -v

# Trigger the enhanced test coverage workflow (requires the GitHub CLI)
gh workflow run test-coverage.yml

# Test GORM logger implementations
go test ./framework/configstore -v
go test ./framework/logstore -v

# Test core-chatbot with latest API changes
cd tests/core-chatbot
go test -v
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes AWS Bedrock URL encoding issues with special characters in query parameters.

## Security considerations

The AWS Bedrock URL encoding fix improves request signing security by properly handling percent-encoded sequences.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
@pjcdawkins pjcdawkins marked this pull request as draft November 14, 2025 10:55
pjcdawkins and others added 6 commits November 14, 2025 11:00
When the context is cancelled, the monitor goroutine force-closes the
response body stream to unblock scanner.Scan(). This causes scanner.Err()
to return an error, which was being treated as a real stream error and
logged/sent to clients.

This commit updates all streaming handlers to check if the context was
cancelled before treating scanner errors as failures. When ctx.Err() is
not nil, scanner errors are expected and should be ignored.

Fixed in providers:
- OpenAI: 5 streaming methods
- Anthropic: 2 streaming methods
- Cohere: 2 streaming methods
- Gemini: 2 streaming methods

This prevents spurious error logs and error responses when requests
timeout or are cancelled.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
The previous drain loop used a non-blocking select with a default case
that could exit prematurely if the queue appeared empty. This created a
race condition where requests enqueued between context cancellation and
queue closure would be missed.

Changed to use blocking receive that continues until the queue is actually
closed, ensuring all queued requests receive cancellation errors.

Also added req.Context.Done() check to avoid sending errors to clients
that already cancelled their requests.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
The previous approach of calling resp.CloseBodyStream() to unblock
scanner.Scan() on context cancellation was causing nil pointer panics
in fasthttp's internals.

Investigation revealed that fasthttp does not support context cancellation
for streaming response bodies. The BodyStream() cannot be safely interrupted
once scanner.Scan() is blocking on I/O.

This commit removes:
- Panic recovery in streaming goroutines
- Context monitor goroutines that called CloseBodyStream()
- All CloseBodyStream() calls

The streaming handlers now rely on:
- Context checks between chunks (select case <-ctx.Done())
- Natural TCP connection timeouts for hung streams
- Scanner error handling (ignoring errors when context is cancelled)

This is a simpler, safer approach. While streams may take longer to clean
up if they hang (30-90s TCP timeout vs immediate), this is rare in practice
and avoids the complexity and crashes of forced closes.

Fixed in providers:
- OpenAI: 5 streaming methods
- Anthropic: 2 streaming methods
- Cohere: 2 streaming methods
- Gemini: 2 streaming methods

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Critical bug: Shutdown() was returning early if the context was already
cancelled, which prevented queues from being closed and workers from
exiting.

Scenario that triggered the bug:
1. User passes timeout context to Init(ctx, ...)
2. Timeout expires, bifrost.ctx becomes cancelled
3. Workers detect ctx.Done() and enter drain mode, waiting for queue closure
4. User calls Shutdown() to clean up
5. Shutdown() sees context is already done and returns early
6. Queues never get closed, workers stuck in drain loop forever

Fix:
- Remove early return when context is already cancelled
- Only skip cancel() call if context already cancelled
- Always close queues (with safety check to avoid double-close panic)
- Always wait for workers to exit
- Always run cleanup for MCP manager and plugins

This ensures workers in drain mode always receive the queue closure signal
they're waiting for, regardless of when Shutdown() is called.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
…q#833)

## Summary

Fixed issues with custom keyless providers during initial list models requests, ensuring proper initialization and handling of providers that don't require API keys.

## Changes

- Added a check to skip key retrieval for providers that don't require keys
- Implemented auto-initialization of providers when they're not yet initialized but have a valid config
- Added support for keyless model listing in Anthropic, Cohere, Gemini, and OpenAI providers
- Updated provider implementations to handle keyless configurations properly

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test listing models with custom keyless providers:

```sh
# Core/Transports
go version
go test ./...

# Test with a custom keyless provider configuration
curl -X GET "http://localhost:8000/v1/models" -H "Authorization: Bearer your_token"
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes issues with custom keyless providers failing to initialize or list models properly.

## Security considerations

This change maintains the security model by properly handling keyless provider configurations without exposing sensitive information.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
Add configurable stream inactivity timeout to NetworkConfig to allow
customization of the timeout duration for detecting and force-closing
hung streams. Previously hardcoded to 60 seconds.

Changes:
- Add StreamInactivityTimeoutInSeconds field to NetworkConfig
- Add DefaultStreamInactivityTimeoutInSeconds constant (60 seconds)
- Update CheckAndSetDefaults to set default if not provided
- Update all streaming methods across providers to use configurable value:
  - OpenAI: 5 streaming methods
  - Anthropic: 2 streaming methods (including shared function)
  - Cohere: 2 streaming methods
  - Gemini: 2 streaming methods

The timeout can now be configured per provider via the NetworkConfig
in the provider configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
pjcdawkins and others added 2 commits November 14, 2025 12:42
Update all provider calls to shared OpenAI and Anthropic streaming
functions to pass the new inactivityTimeoutSeconds parameter.

This fixes build errors after adding the configurable timeout parameter
to the shared streaming function signatures.

Changes:
- Update function signatures for shared streaming handlers to accept
  timeout parameter
- Add provider.networkConfig.StreamInactivityTimeoutInSeconds parameter
  to all streaming function calls across providers:
  - cerebras, groq, ollama, openrouter, parasail, sgl
  - azure, mistral, perplexity, gemini, vertex
- Fix timeout checks in OpenAI shared functions to use parameter instead
  of accessing undefined provider variable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
```go
	return
}

// Track last activity time for inactivity timeout detection
```
Collaborator


@pjcdawkins can you add a common util for this? Also, let's add a field in the UI for the stream inactivity timeout in network config

Collaborator

@pjcdawkins let's please also add the inactivity check to speech and transcription streaming

@pjcdawkins
Author

Good points - I set this as a draft because it's a relatively hard problem and I'll come back to it soon! But it seems important. I think it'll need a few more tests to demonstrate the problem(s) and fixes, and some consolidation as you say.

@Pratham-Mishra04
Collaborator

Yes sure no worries, lmk if you need a hand😄

@akshaydeo
Contributor

also, could you rebase this branch onto our latest main? We had to nuke one file from the history, which had to be done via a rewrite.



Development

Successfully merging this pull request may close these issues.

[Bug]: Goroutines leak on context cancellation

5 participants