Fix goroutine leaks on context cancellation #829
Conversation
## Summary

Improve the Mintlify changelog generation script to clear changelog files after processing and handle empty changelog content gracefully.

## Changes

- Modified the `append_section` function to check if the changelog content is empty before adding it to the output
- Added functionality to clear changelog files after they've been processed, preventing duplicate entries in future runs
- Improved error handling by removing the early return when content retrieval fails

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [ ] Core (Go)
- [ ] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [x] Docs

## How to test

Run the changelog generation script and verify:

1. Empty changelog files are handled properly
2. Changelog files are cleared after processing
3. The generated output contains only non-empty sections

```sh
# Run the changelog generation script
.github/workflows/scripts/push-mintlify-changelog.sh
```

## Breaking changes

- [ ] Yes
- [x] No

## Security considerations

No security implications as this only affects the documentation build process.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I verified the CI pipeline passes locally if applicable
…_gemini_responses_stream fix: handle panic in gemini responses stream
## Summary

Enhance the ListModels functionality to aggregate results from multiple API keys, improving model discovery and availability.

## Changes

- Modified `ListModels` interface to accept multiple keys instead of a single key
- Implemented key aggregation logic that combines model lists from multiple API keys (see the sketch after this message)
- Added deduplication of models based on model ID when aggregating results
- Created a best-effort approach that continues processing remaining keys even if some fail
- Added a new helper function `getAllSupportedKeys` to retrieve all valid keys for a provider
- Updated all provider implementations to support the multi-key approach

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the ListModels endpoint with multiple API keys configured for the same provider:

```sh
# Core/Transports
go version
go test ./...

# Test with multiple keys for a provider (e.g., OpenAI)
curl -X GET "http://localhost:8000/v1/models?provider=openai" -H "Authorization: Bearer your-bifrost-token"
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Improves model discovery by aggregating results from multiple API keys.

## Security considerations

The implementation maintains the same security model as before, with proper error handling to avoid leaking sensitive information.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
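A rough sketch of the aggregation logic this describes. It is a fragment, not the actual implementation: the `listModelsForKey` helper and the `ID` field on `schemas.Model` are illustrative assumptions.

```go
// Best-effort, multi-key model aggregation with ID-based deduplication.
func aggregateModels(ctx context.Context, keys []schemas.Key) []schemas.Model {
	seen := make(map[string]bool)
	all := make([]schemas.Model, 0)
	for _, key := range keys {
		models, err := listModelsForKey(ctx, key) // assumed per-key fetch helper
		if err != nil {
			continue // best-effort: one failing key doesn't abort the rest
		}
		for _, m := range models {
			if !seen[m.ID] { // dedupe by model ID across keys
				seen[m.ID] = true
				all = append(all, m)
			}
		}
	}
	return all
}
```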
…ized request handling (maximhq#724)

## Summary

Refactored the Cohere provider to use a unified request handling approach and improved response object pooling for better performance and memory efficiency.

## Changes

- Implemented `completeRequest` method in Cohere provider to centralize API request handling
- Added object pooling for Cohere embedding responses to reduce memory allocations
- Standardized response handling across all Cohere API endpoints
- Added id, object, and model fields to Chat Completion responses from Bedrock and Cohere providers
- Updated list models functionality to use all available keys

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the Cohere provider with various request types to ensure proper functionality:

```sh
# Core/Transports
go version
go test ./...

# Test Cohere provider specifically
go test ./core/providers -run TestCohereProvider
```

## Breaking changes

- [x] No
- [ ] Yes

## Related issues

Improves performance and standardizes response handling across providers.

## Security considerations

No security implications as this is an internal refactoring of existing functionality.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…pi (maximhq#702)

## Summary

Implement proper type handling for Anthropic tools in the schema conversion layer, adding support for specialized tool types like computer use preview, web search, and local shell.

## Changes

- Added `ResponsesToolType` as a proper string type with constants for all supported tool types
- Enhanced Anthropic tool conversion to properly handle specialized tool types:
  - Computer use preview tools
  - Web search tools
  - Local shell (bash) tools
- Added proper type definitions for Anthropic-specific tool structures
- Fixed type conversion in the chat tool mux to use the new type system

## Type of change

- [x] Feature
- [x] Refactor

## Affected areas

- [x] Core (Go)
- [x] Providers/Integrations

## How to test

```sh
# Core/Transports
go version
go test ./...
```

Test with Anthropic API calls that use specialized tools like computer use preview, web search, and local shell to verify proper conversion between Bifrost and Anthropic formats.

## Breaking changes

- [x] No

## Related issues

Improves tool handling for Anthropic provider integration.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
…aximhq#725)

## Summary

Add support for storing and displaying Responses API input history in logs, enabling better tracking and debugging of Responses API requests.

## Changes

- Added a new `responses_input_history` column to the logs database table
- Created a database migration to add the new column
- Modified the logging plugin to extract and store Responses API input messages
- Updated the UI to display Responses input history in log details
- Renamed `LogResponsesOutputView` to `LogResponsesMessageView` for reuse with both input and output
- Fixed an issue in the post-hook runner by clearing `ChatResponse` after converting to `ResponsesStreamResponse`

## Type of change

- [ ] Bug fix
- [x] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

1. Make Responses API requests
2. Check logs to verify input history is properly stored and displayed
3. Verify that both input and output messages are properly displayed in the UI

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

## Breaking changes

- [x] Yes
- [ ] No

The database schema has been updated with a new column. Users should run the application to trigger the migration.

## Security considerations

No new security implications. The implementation follows the same patterns as existing log storage.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
## Summary

This PR updates the PreRequestCallback interface to include a Bifrost context parameter, allowing callbacks to modify the context before it's passed to Bifrost. It also adds a rawBody parameter to provide access to the original request body.

## Changes

- Modified the `PreRequestCallback` function signature to include `bifrostCtx *context.Context` and `rawBody []byte` parameters
- Updated all implementations of `PreRequestCallback` in various integration files (anthropic.go, genai.go, openai.go)
- Added a new router.go file that contains the centralized router implementation with the updated interface
- Moved most of the code from utils.go to router.go for better organization
- Simplified utils.go to contain only utility functions

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [ ] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test that pre-request callbacks can now modify the Bifrost context:

```sh
# Core/Transports
go version
go test ./...

# Test with a custom integration that uses the PreRequestCallback
# to modify the Bifrost context before processing
```

## Breaking changes

- [x] Yes
- [ ] No

This changes the signature of the `PreRequestCallback` interface. Any custom implementations will need to be updated to match the new signature.

## Related issues

Enables better context manipulation in pre-request callbacks, which helps with request processing customization.

## Security considerations

No additional security implications. The change maintains the existing security model while providing more flexibility.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…raw request passthrough and response gzip decompression and anthropic integration passthrough (maximhq#731)

## Summary

Add support for passthrough mode to allow direct forwarding of requests to providers without modifying the request body. This enables advanced use cases like Claude Code and other provider-specific features that require exact request formats.

## Changes

- Added context keys to support passthrough mode: `BifrostContextKeySkipKeySelection`, `BifrostContextKeyExtraHeaders`, `BifrostContextKeyURLPath`, and `BifrostContextKeyRequestBody`
- Implemented passthrough detection for Anthropic API requests
- Refactored provider code to check for raw request bodies in context before converting
- Added support for preserving original URL paths and headers
- Improved error handling for unsupported operations with standardized messages
- Fixed handling of response bodies with different content encodings (gzip)
- Added support for skipping key selection when appropriate

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test passthrough mode with Anthropic API:

```sh
# Test with Claude Code request
curl -X POST http://localhost:8080/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-opus-20240229",
    "messages": [{"role": "user", "content": "Write a simple Python function to calculate Fibonacci numbers"}],
    "max_tokens": 1000
  }'

# Test with standard API key
curl -X POST http://localhost:8080/anthropic/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-api-key: your-api-key" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-opus-20240229",
    "messages": [{"role": "user", "content": "Hello"}],
    "max_tokens": 100
  }'
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Enables support for Claude Code and other provider-specific features that require exact request formats.

## Security considerations

The implementation preserves security by:

- Filtering out hop-by-hop headers that shouldn't be forwarded
- Maintaining proper authentication mechanisms
- Only allowing passthrough for specific providers and endpoints

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
…requests (maximhq#732)

## Summary

Implement proper context cancellation for HTTP requests to prevent resource leaks and wasted provider quota when clients disconnect, especially during streaming operations.

## Changes

- Added cancellable context support in `ConvertToBifrostContext()` which now returns both a context and a cancel function (see the sketch after this message)
- Ensured all HTTP handlers properly defer the cancel function to clean up resources
- Modified streaming handlers to immediately cancel upstream provider requests when clients disconnect
- Added detailed comments explaining the context cancellation pattern and its importance
- Fixed potential resource leaks in streaming handlers by properly handling write errors

## Type of change

- [x] Bug fix
- [ ] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Verify that streaming requests are properly cancelled when clients disconnect:

```sh
# Start Bifrost server
go run cmd/bifrost/main.go

# In another terminal, start a streaming request and then cancel it (Ctrl+C)
curl -N http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-3.5-turbo", "messages": [{"role": "user", "content": "Write a very long story"}], "stream": true}'

# Verify in server logs that the upstream request was cancelled
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Addresses resource leaks and wasted provider quota when clients disconnect during streaming requests.

## Security considerations

Improves resource management by ensuring provider requests are cancelled when no longer needed, preventing potential resource exhaustion.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
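A minimal sketch of the handler pattern this PR describes, assuming `ConvertToBifrostContext` returns `(context.Context, context.CancelFunc)` as stated in the changes; it is a fragment, not the actual handler code.

```go
// Derive a cancellable Bifrost context and always defer cancel, so the
// upstream provider request is abandoned as soon as the client disconnects
// or the handler returns.
bifrostCtx, cancel := ConvertToBifrostContext(requestCtx)
defer cancel()

// ...use bifrostCtx for the provider call; on a streaming write error,
// calling cancel() immediately tears down the upstream request.
```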
## Summary

- Adds dynamic plugin architecture
- Add auth for dashboard and inference calls

Closes maximhq#246
Closes maximhq#518

## Changes

- What was changed and why
- Any notable design decisions or trade-offs

## Type of change

- [x] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [ ] Providers/Integrations
- [ ] Plugins
- [x] UI (Next.js)
- [x] Docs

## How to test

Describe the steps to validate this change. Include commands and expected outcomes.

```sh
# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build
```

If adding new configs or environment variables, document them here.

## Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

## Breaking changes

- [ ] Yes
- [x] No

If yes, describe impact and migration instructions.

## Related issues

Link related issues and discussions. Example: Closes maximhq#123

## Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
version bumps
Actionable comments posted: 2
♻️ Duplicate comments (5)
core/providers/gemini/gemini.go (1)
749-758: TranscriptionStream uses the same robust cancellation pattern

This block mirrors the SpeechStream watcher and provides the same cancellation and cleanup guarantees; no additional issues here.
core/providers/openai/openai.go (4)
798-812: Same cancellation watcher pattern applied to chat streams

Chat SSE streaming now uses the same `ctx.Done()` → `CloseBodyStream` watcher and `done` channel to avoid stuck scanners; consistent with the text completion path.
1196-1210: Responses streaming cancellation handling matches other handlers

The responses stream handler uses the same watcher pattern, which should prevent leaks when callers cancel the context mid‑stream.
1593-1607: Speech streaming gains proper cancellation unblocking

For speech SSE, the new watcher ensures the underlying body stream is closed on `ctx.Done()`, so the scanner loop can exit promptly instead of hanging on I/O.
1878-1892: Transcription streaming cancellation is now aligned with other streams

Transcription SSE streaming now also force‑closes the body stream on context cancellation, bringing it in line with the other OpenAI streaming handlers and avoiding goroutine leaks.
🧹 Nitpick comments (3)
core/bifrost.go (1)
2058-2071: Align key-selection error send with context-aware response sendingIn the key-selection failure path you still use a bare send:
req.Err <- schemas.BifrostError{ ... } returnWhereas later you’ve switched to
select-based, context-aware sends with a timeout for both responses and streams. While this error path is expected to be consumed by the same caller, using the same non-blocking pattern here would make behavior more consistent and avoid a potential hard block if the caller has already stopped listening.You could mirror the later pattern:
- if err != nil { - bifrost.logger.Warn("error selecting key for model %s: %v", model, err) - req.Err <- schemas.BifrostError{ - IsBifrostError: false, - Error: &schemas.ErrorField{ - Message: err.Error(), - Error: err, - }, - } - return - } + if err != nil { + bifrost.logger.Warn("error selecting key for model %s: %v", model, err) + bifrostErr := schemas.BifrostError{ + IsBifrostError: false, + Error: &schemas.ErrorField{ + Message: err.Error(), + Error: err, + }, + } + select { + case req.Err <- bifrostErr: + case <-req.Context.Done(): + bifrost.logger.Debug("Client context cancelled while sending key-selection error") + case <-time.After(5 * time.Second): + bifrost.logger.Warn("Timeout while sending key-selection error, client may have disconnected") + } + return + }Not strictly required for correctness, but it would make this path consistent with the new response/stream handling.
core/bifrost_test.go (1)
965-1048: Goroutine leak test is well‑targeted; consider flakiness and runtime cost

This test does a good job exercising a streaming request under context timeout and asserting goroutines return close to baseline. Two minor considerations:

- Using GC + `time.Sleep(100 * time.Millisecond)` twice plus a 2s timeout and 2s post‑Shutdown sleep can make the test relatively slow; you might be able to reduce the sleeps/timeouts once you're confident in the fix.
- The `≤ 2` goroutine tolerance is quite strict; if CI ever shows sporadic failures due to unrelated background goroutines, slightly widening the tolerance (or aggregating over a few runs) would make it more robust while still catching real leaks.

Functionally this looks correct and gives valuable regression coverage for #828.
core/providers/openai/openai.go (1)
398-411: Context cancellation watcher correctly unblocks text‑completion stream

The added `done`/`ctx.Done()` watcher that calls `resp.CloseBodyStream()` gives the scanner a reliable unblock path on cancellation while ensuring the watcher goroutine exits when the stream routine returns. This directly addresses stuck streaming goroutines.

If you find yourself touching this pattern again, a small helper (e.g., `withStreamCancellationWatcher(ctx, resp, func(...) { ... })`) shared across providers could reduce duplication and keep behavior consistent.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
- core/bifrost.go (1 hunks)
- core/bifrost_test.go (2 hunks)
- core/providers/anthropic/anthropic.go (2 hunks)
- core/providers/anthropic/chat.go (1 hunks)
- core/providers/azure/types.go (1 hunks)
- core/providers/bedrock/chat.go (2 hunks)
- core/providers/cohere/chat.go (1 hunks)
- core/providers/cohere/cohere.go (2 hunks)
- core/providers/cohere/models.go (1 hunks)
- core/providers/gemini/chat.go (1 hunks)
- core/providers/gemini/gemini.go (2 hunks)
- core/providers/gemini/models.go (2 hunks)
- core/providers/openai/openai.go (5 hunks)
- core/providers/openai/types.go (1 hunks)
- core/providers/vertex/models.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (7)
core/providers/gemini/chat.go (2)
- core/providers/gemini/types.go (1): `Type` (769-769)
- core/schemas/chatcompletions.go (1): `ChatToolChoiceTypeFunction` (250-250)

core/providers/cohere/models.go (1)
- core/schemas/models.go (1): `Model` (47-66)

core/bifrost.go (4)
- core/schemas/provider.go (4): `Provider` (207-234), `ProviderConfig` (161-169), `CustomProviderConfig` (142-148), `PostHookRunner` (204-204)
- core/schemas/bifrost.go (9): `BifrostRequest` (140-150), `BifrostResponse` (243-253), `BifrostStream` (315-322), `BifrostError` (350-359), `RequestType` (81-81), `ErrorField` (368-375), `BifrostContextKeySelectedKeyID` (107-107), `BifrostContextKeySelectedKeyName` (108-108), `BifrostErrorExtraFields` (417-421)
- core/schemas/account.go (1): `Key` (8-17)
- core/utils.go (1): `IsStreamRequestType` (190-192)

core/bifrost_test.go (7)
- core/schemas/bifrost.go (2): `OpenAI` (35-35), `BifrostConfig` (21-29)
- core/bifrost.go (1): `Init` (86-214)
- core/schemas/account.go (1): `Account` (53-69)
- core/schemas/logger.go (2): `Logger` (28-55), `LogLevelError` (13-13)
- core/logger.go (1): `NewDefaultLogger` (40-49)
- core/schemas/chatcompletions.go (3): `BifrostChatRequest` (11-18), `ChatMessage` (346-355), `ChatMessageContent` (358-361)
- core/schemas/provider.go (1): `Provider` (207-234)

core/providers/gemini/models.go (2)
- core/schemas/models.go (1): `Model` (47-66)
- core/providers/gemini/types.go (1): `GeminiModel` (1255-1269)

core/providers/vertex/models.go (1)
- core/schemas/models.go (1): `Model` (47-66)

core/providers/bedrock/chat.go (1)
- core/schemas/chatcompletions.go (1): `ChatAssistantMessageToolCall` (483-488)
🔇 Additional comments (16)
core/providers/bedrock/chat.go (2)
90-91: LGTM! Proper string pointer initialization for tool calls.

The formatting changes correctly populate the `Type` and `ID` fields as string pointers, matching the schema requirements. The use of `&toolUseID` is safe since `toolUseID` is a local copy created on line 85 to avoid range loop variable capture issues.

188-188: LGTM! Standard variable declaration.

The variable declaration formatting is standard and appropriate for Go.

core/providers/azure/types.go (1)

20-30: No-op struct formatting change

Only field alignment on `AzureModel.CreatedAt`; tags and types are unchanged, so there's no behavioral impact. Looks good as-is.

core/providers/anthropic/chat.go (1)

704-732: Comment wording/whitespace tweak is accurate

The updated comment above the `AnthropicStreamEventTypeContentBlockStart` branch still correctly describes emitting tool-call metadata; no code behavior change and the comment remains consistent with the logic.

core/providers/cohere/models.go (1)

10-12: Slice init formatting only

`Data: make([]schemas.Model, 0, len(response.Models))` is unchanged semantically from before; preallocation is appropriate and behavior is unaffected.

core/providers/gemini/chat.go (1)

70-80: Tool call struct literal spacing only

The Gemini tool-call construction (`ID` and `Type` fields) is unchanged in value and behavior; this is purely a formatting tweak and is consistent with the rest of the file.

core/providers/cohere/chat.go (1)

271-278: Formatting-only change in tool call mapping

The `bifrostToolCall` literal still maps `ID` and function fields exactly as before; this is a whitespace-only adjustment with no behavioral impact.

core/providers/gemini/gemini.go (1)

490-499: Context cancellation watcher for SpeechStream looks correct

The done-channel + watcher goroutine cleanly forces `resp.CloseBodyStream()` on `ctx.Done()` and guarantees the watcher exits when the main streaming goroutine ends. This should unblock the scanner without introducing leaks.

core/providers/cohere/cohere.go (2)

411-420: ChatCompletionStream cancellation watcher is well-scoped

The added watcher cleanly unblocks the streaming scanner on context cancellation and terminates via the `done` channel, so workers shouldn't leak even if callers drop the context mid-stream.

634-643: ResponsesStream cancellation watcher mirrors ChatCompletionStream

Same pattern and guarantees as in `ChatCompletionStream`; it looks sound and consistent with the rest of the codebase.

core/providers/gemini/models.go (2)

15-16: Formatting-only struct literal changes

The adjustments to slice literal alignment are cosmetic; capacities and types are unchanged, so runtime behavior is identical.

Also applies to: 42-43

core/providers/anthropic/anthropic.go (2)

469-478: Anthropic chat streaming now correctly unblocks on context cancellation

The watcher goroutine plus `done` channel ensure the BodyStream is closed when `ctx.Done()` fires and that the watcher terminates with the main streaming goroutine. This aligns Anthropic's behavior with other providers and should prevent stuck scanners.

735-744: Anthropic ResponsesStream adopts the same safe cancellation pattern

Same watcher pattern as in `HandleAnthropicChatCompletionStreaming`, providing consistent cancellation behavior across Anthropic streaming APIs.

core/providers/vertex/models.go (1)

10-12: Slice preallocation formatting only

`Data` still preallocates to `len(response.Models)`; behavior unchanged and correct.

core/providers/openai/types.go (1)

128-134: Struct field whitespace cleanup is safe

`Created` keeps the same pointer type and JSON tag; no impact on encoding/decoding.

core/bifrost_test.go (1)

3-13: New test imports are appropriate and scoped

`runtime` and `testify/assert` are used only for the goroutine-leak test; the import block remains tidy.
Hey @pjcdawkins, thanks for the PR! It's looking great. There are a few solid CodeRabbit comments to address though. And while you're at it, please add your update to `core/changelog.md`.
…extraction (maximhq#789)

## Summary

Enhance changelog generation by adding version information to update labels in the Mintlify documentation.

## Changes

- Renamed `get_changelog_content` to `get_file_content` to better reflect its more general purpose
- Added support for reading version files and trimming whitespace/newlines from version content
- Modified the `append_section` function to include version information from a separate version file in the "description" field of Update tags (see the sketch after this message)
- Added error handling for empty changelog content
- Improved file existence checking to prevent errors when files don't exist

## Type of change

- [x] Feature
- [x] Refactor

## Affected areas

- [x] Docs
- [x] Chore/CI

## How to test

Run the changelog generation workflow manually:

```sh
# Test the script with a sample changelog and version file
mkdir -p test/changelog
echo "- New feature added" > test/changelog/changelog.md
echo "1.2.3" > test/changelog/version
.github/workflows/scripts/push-mintlify-changelog.sh
```

## Related issues

Improves documentation automation by including version information in changelog entries.

## Checklist

- [x] I verified the CI pipeline passes locally if applicable
- [x] I updated documentation where needed
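A hypothetical sketch of the `append_section` behavior described above. The Update-tag attributes, `$OUTPUT_FILE`, and the exact output layout are assumptions, not the script's actual format; only `get_file_content`, the empty-content skip, the version trimming, and the file-existence check come from the description.

```sh
append_section() {
  local changelog_file="$1" version_file="$2"
  local content version

  content="$(get_file_content "$changelog_file")"
  # Skip empty changelog content entirely (no empty Update sections)
  [ -z "$content" ] && return 0

  version=""
  if [ -f "$version_file" ]; then
    # Trim whitespace/newlines from the version file's content
    version="$(get_file_content "$version_file" | tr -d '[:space:]')"
  fi

  {
    printf '<Update description="%s">\n' "$version"
    printf '%s\n' "$content"
    printf '</Update>\n'
  } >>"$OUTPUT_FILE" # assumed accumulator for the generated page
}
```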
This commit addresses two critical concurrency issues identified in the PR review:

1. Worker exit on context cancellation leaving queued requests without responses:
   - Reverted worker loop to simple `for req := range queue` pattern
   - Workers now only exit when queue is closed, ensuring all queued requests are processed
   - This matches the pre-PR implementation and prevents dropped requests during shutdown

2. Streaming plugin pipeline data race:
   - Removed pipeline release for streaming requests to prevent data race
   - Pipeline is acquired once and captured in postHookRunner closure
   - Pipeline will be garbage collected when no longer referenced
   - Trade-off: pipelines are not reused for streaming requests, but data race is avoided

Additional improvements:

- Improved code comments in Shutdown() method for clarity
- Maintained original shutdown sequence: cancel context first (stops provider goroutines), then close queues (signals workers), then wait for workers to drain and exit

Reference: maximhq#829 (comment)
Reference: maximhq#829 (comment)

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Actionable comments posted: 0
🧹 Nitpick comments (1)
core/changelog.md (1)
1-1: Changelog entry accurately captures the fix.

The entry correctly documents the goroutine leak fix across both the worker loop and streaming handlers, addressing the PR objectives. The concise format is appropriate for a changelog.
Minor enhancement: Consider mentioning "context cancellation" for clarity, since that's the mechanism driving the fix—e.g., "fix: Goroutine leaks during context cancellation in worker loop and streaming request handlers". This helps users understand the specific scenario that was fixed.
Workers were blocking on queue channel receive when the context was cancelled, causing goroutine leaks that could persist beyond the context timeout. This fix ensures workers exit promptly when the context is cancelled by:

- Listening for context cancellation in a select statement
- Draining any remaining queued requests and sending cancellation errors
- Preventing callers from hanging indefinitely waiting for responses

This addresses the scenario where context timeout occurs without explicit Shutdown() being called, which was causing workers to remain blocked waiting for the queue to be closed.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Actionable comments posted: 1
🧹 Nitpick comments (1)
core/bifrost.go (1)
2564-2569: Document expected Shutdown usage to avoid race condition.

The shutdown sequence (cancel context → close queues → wait for workers) is intentional to signal streaming goroutines to unblock. However, this creates a brief window where new requests could be enqueued after context cancellation but before queues are closed.
Recommend documenting that `Shutdown()` should only be called when no new requests are actively being submitted:

```diff
-// Shutdown gracefully stops all workers when triggered.
-// It closes all request channels and waits for workers to exit.
+// Shutdown gracefully stops all workers when triggered.
+// It cancels the bifrost context to signal streaming goroutines to stop,
+// closes all request channels, and waits for workers to exit.
+//
+// Important: Shutdown should be called only when no new requests are being
+// submitted to avoid race conditions where requests could be enqueued after
+// context cancellation but before queue closure.
 func (bifrost *Bifrost) Shutdown() {
```

This clarifies the contract for callers.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- core/bifrost.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
core/bifrost.go (4)
- core/schemas/bifrost.go (9): `BifrostError` (350-359), `ErrorField` (368-375), `BifrostRequest` (140-150), `BifrostResponse` (243-253), `BifrostStream` (315-322), `RequestType` (81-81), `BifrostContextKeySelectedKeyID` (107-107), `BifrostContextKeySelectedKeyName` (108-108), `BifrostErrorExtraFields` (417-421)
- core/schemas/provider.go (4): `Provider` (207-234), `ProviderConfig` (161-169), `CustomProviderConfig` (142-148), `PostHookRunner` (204-204)
- core/schemas/account.go (1): `Key` (8-17)
- core/utils.go (1): `IsStreamRequestType` (190-192)
🔇 Additional comments (2)
core/bifrost.go (2)
2102-2131: Plugin pipeline pooling disabled for streaming to avoid data race (acknowledged trade-off).

The code correctly avoids releasing the plugin pipeline for streaming requests because provider goroutines asynchronously invoke `postHookRunner`, which would access a pipeline that might have been reset and reused by another request.

This addresses the critical data race flagged in the past review comments. The trade-off (no pooling for streaming) is reasonable given that correctness is prioritized over efficiency.
Note: Streaming requests will allocate a new pipeline for each request instead of reusing from the pool. Under high streaming load, this may increase GC pressure, but it ensures thread safety.
2139-2176: Excellent defensive programming with context-aware sends.

The context-aware response/error/stream sends with timeouts (lines 2140-2176) effectively prevent workers from blocking indefinitely when clients disconnect or stop listening. This ensures that workers can be reclaimed even if clients misbehave.
The 5-second timeout is a reasonable balance between accommodating slow clients and preventing indefinite blocking.
…nts (maximhq#815)

## Summary

Enhances the Responses API streaming implementation to better align with OpenAI's streaming format, providing more consistent and reliable tool call streaming across providers.

## Changes

- Refactored Anthropic's Responses API streaming to use a stateful accumulator for tracking output items
- Added similar streaming accumulators to Bedrock and Cohere providers
- Implemented proper OpenAI-style lifecycle events (created, in_progress, completed) for all providers
- Improved tool call streaming with proper output indexing and argument accumulation
- Replaced the chat-to-responses conversion approach with native implementations for each provider
- Added comprehensive tests for tool call streaming functionality

## Type of change

- [ ] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test the Responses API streaming with tool calls across different providers:

```sh
# Run the tool calls streaming tests
go test ./tests/core-providers -run TestOpenAI/ToolCallsStreamingResponses
go test ./tests/core-providers -run TestAnthropic/ToolCallsStreamingResponses
go test ./tests/core-providers -run TestBedrock/ToolCallsStreamingResponses
go test ./tests/core-providers -run TestCohere/ToolCallsStreamingResponses
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Improves the reliability of tool call streaming in the Responses API.

## Security considerations

No security implications.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
)

## Summary

Added model pricing data support to the Bifrost core, enabling automatic pricing information in model listings and providing a mechanism to sync pricing data from external sources.

## Changes

- Added a `pricingData` field to the Bifrost struct to store pricing information for models
- Created new methods to set and retrieve pricing data for models
- Added a `DataSheetPricingEntry` struct to represent detailed pricing information
- Enhanced `BifrostListModelsResponse` with an `AddPricing` method to include pricing in model listings
- Integrated pricing data with the model catalog framework
- Added a callback mechanism to update pricing data when synced from external sources
- Added `DeploymentName` field to the Model struct

## Type of change

- [x] Feature
- [x] Refactor

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)

## How to test

```sh
# Core/Transports
go version
go test ./...

# Test model listing with pricing data
curl -X GET "http://localhost:8000/v1/models" -H "Authorization: Bearer $BIFROST_API_KEY"
```

## Breaking changes

- [x] No

## Related issues

Enhances model listing capabilities with pricing information

## Security considerations

No security implications as this only adds informational data to model responses.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
## Summary

Enhance test coverage workflow with improved integration testing capabilities and fix AWS Bedrock URL encoding issues.

## Changes

- Enhanced test coverage workflow to include UI build, Redis and Weaviate services for integration tests, and plugin rebuilding
- Fixed AWS Bedrock URL encoding by properly handling percent-encoded sequences in query parameters
- Added GORM logger implementations for configstore and logstore to improve database logging
- Updated core-chatbot test module to work with latest Bifrost API changes
- Fixed model catalog pricing lookup method naming for consistency

## Type of change

- [x] Bug fix
- [x] Feature
- [x] Refactor
- [ ] Documentation
- [x] Chore/CI

## Affected areas

- [x] Core (Go)
- [ ] Transports (HTTP)
- [x] Providers/Integrations
- [x] Plugins
- [x] UI (Next.js)
- [ ] Docs

## How to test

```sh
# Test AWS Bedrock URL encoding fix
go test ./core/providers/bedrock -v

# Test the enhanced test coverage workflow
cd .github/workflows
./test-coverage.yml

# Test GORM logger implementations
go test ./framework/configstore -v
go test ./framework/logstore -v

# Test core-chatbot with latest API changes
cd tests/core-chatbot
go test -v
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes AWS Bedrock URL encoding issues with special characters in query parameters.

## Security considerations

The AWS Bedrock URL encoding fix improves request signing security by properly handling percent-encoded sequences.

## Checklist

- [x] I added/updated tests where appropriate
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
When the context is cancelled, the monitor goroutine force-closes the response body stream to unblock scanner.Scan(). This causes scanner.Err() to return an error, which was being treated as a real stream error and logged/sent to clients.

This commit updates all streaming handlers to check if the context was cancelled before treating scanner errors as failures. When ctx.Err() is not nil, scanner errors are expected and should be ignored.

Fixed in providers:

- OpenAI: 5 streaming methods
- Anthropic: 2 streaming methods
- Cohere: 2 streaming methods
- Gemini: 2 streaming methods

This prevents spurious error logs and error responses when requests timeout or are cancelled.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
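A fragment illustrating the check this commit describes, assuming `scanner` wraps the provider's SSE body stream; the error-handling tail is elided.

```go
// Only surface scanner errors while the context is still live; after
// cancellation, the forced body-stream close makes an error expected.
if err := scanner.Err(); err != nil {
	if ctx.Err() != nil {
		return // cancelled or timed out: not a real stream failure
	}
	// ...treat as a genuine stream error (log / send to client)...
}
```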
The previous drain loop used a non-blocking select with a default case that could exit prematurely if the queue appeared empty. This created a race condition where requests enqueued between context cancellation and queue closure would be missed.

Changed to use blocking receive that continues until the queue is actually closed, ensuring all queued requests receive cancellation errors. Also added req.Context.Done() check to avoid sending errors to clients that already cancelled their requests.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
The previous approach of calling resp.CloseBodyStream() to unblock scanner.Scan() on context cancellation was causing nil pointer panics in fasthttp's internals.

Investigation revealed that fasthttp does not support context cancellation for streaming response bodies. The BodyStream() cannot be safely interrupted once scanner.Scan() is blocking on I/O.

This commit removes:

- Panic recovery in streaming goroutines
- Context monitor goroutines that called CloseBodyStream()
- All CloseBodyStream() calls

The streaming handlers now rely on:

- Context checks between chunks (select case <-ctx.Done())
- Natural TCP connection timeouts for hung streams
- Scanner error handling (ignoring errors when context is cancelled)

This is a simpler, safer approach. While streams may take longer to clean up if they hang (30-90s TCP timeout vs immediate), this is rare in practice and avoids the complexity and crashes of forced closes.

Fixed in providers:

- OpenAI: 5 streaming methods
- Anthropic: 2 streaming methods
- Cohere: 2 streaming methods
- Gemini: 2 streaming methods

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
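A fragment of the simpler approach described above, with `processChunk` standing in for each provider's actual chunk handling.

```go
// Cooperative cancellation between chunks, with no forced close of
// fasthttp's body stream.
for scanner.Scan() {
	select {
	case <-ctx.Done():
		return // stop between chunks; TCP timeouts reap truly hung streams
	default:
	}
	processChunk(scanner.Bytes()) // assumed per-provider chunk handler
}
```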
Critical bug: Shutdown() was returning early if the context was already cancelled, which prevented queues from being closed and workers from exiting.

Scenario that triggered the bug:

1. User passes timeout context to Init(ctx, ...)
2. Timeout expires, bifrost.ctx becomes cancelled
3. Workers detect ctx.Done() and enter drain mode, waiting for queue closure
4. User calls Shutdown() to clean up
5. Shutdown() sees context is already done and returns early
6. Queues never get closed, workers stuck in drain loop forever

Fix:

- Remove early return when context is already cancelled
- Only skip cancel() call if context already cancelled
- Always close queues (with safety check to avoid double-close panic)
- Always wait for workers to exit
- Always run cleanup for MCP manager and plugins

This ensures workers in drain mode always receive the queue closure signal they're waiting for, regardless of when Shutdown() is called.

Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
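A condensed sketch of the fixed sequence. The `closeOnce`, `queues`, and `workerWG` fields are stand-ins for whatever double-close protection and bookkeeping the struct actually uses.

```go
func (bifrost *Bifrost) Shutdown() {
	// Cancel first; skip only if a timeout already cancelled the context.
	if bifrost.ctx.Err() == nil {
		bifrost.cancel()
	}
	// Always close queues so draining workers get their exit signal,
	// guarded against a double-close panic.
	bifrost.closeOnce.Do(func() {
		for _, q := range bifrost.queues {
			close(q)
		}
	})
	bifrost.workerWG.Wait() // always wait for workers to exit
	// ...cleanup for the MCP manager and plugins runs unconditionally...
}
```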
…q#833)

## Summary

Fixed issues with custom keyless providers during initial list models requests, ensuring proper initialization and handling of providers that don't require API keys.

## Changes

- Added a check to skip key retrieval for providers that don't require keys
- Implemented auto-initialization of providers when they're not yet initialized but have a valid config
- Added support for keyless model listing in Anthropic, Cohere, Gemini, and OpenAI providers
- Updated provider implementations to handle keyless configurations properly

## Type of change

- [x] Bug fix
- [ ] Feature
- [ ] Refactor
- [ ] Documentation
- [ ] Chore/CI

## Affected areas

- [x] Core (Go)
- [x] Transports (HTTP)
- [x] Providers/Integrations
- [ ] Plugins
- [ ] UI (Next.js)
- [ ] Docs

## How to test

Test listing models with custom keyless providers:

```sh
# Core/Transports
go version
go test ./...

# Test with a custom keyless provider configuration
curl -X GET "http://localhost:8000/v1/models" -H "Authorization: Bearer your_token"
```

## Breaking changes

- [ ] Yes
- [x] No

## Related issues

Fixes issues with custom keyless providers failing to initialize or list models properly.

## Security considerations

This change maintains the security model by properly handling keyless provider configurations without exposing sensitive information.

## Checklist

- [x] I read `docs/contributing/README.md` and followed the guidelines
- [x] I added/updated tests where appropriate
- [x] I updated documentation where needed
- [x] I verified builds succeed (Go and UI)
- [x] I verified the CI pipeline passes locally if applicable
Add configurable stream inactivity timeout to NetworkConfig to allow customization of the timeout duration for detecting and force-closing hung streams. Previously hardcoded to 60 seconds.

Changes:

- Add StreamInactivityTimeoutInSeconds field to NetworkConfig
- Add DefaultStreamInactivityTimeoutInSeconds constant (60 seconds)
- Update CheckAndSetDefaults to set default if not provided
- Update all streaming methods across providers to use configurable value:
  - OpenAI: 5 streaming methods
  - Anthropic: 2 streaming methods (including shared function)
  - Cohere: 2 streaming methods
  - Gemini: 2 streaming methods

The timeout can now be configured per provider via the NetworkConfig in the provider configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
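A sketch of an inactivity watchdog matching this description. The `done` channel mirrors the watcher pattern discussed earlier in this PR, and `resp`/`networkConfig` field access are assumptions about the surrounding handler.

```go
// Watchdog fragment: force-close a stream that has produced no chunks within
// the configured window; lastActivity is refreshed on every received chunk.
inactivity := time.Duration(networkConfig.StreamInactivityTimeoutInSeconds) * time.Second
var lastActivity atomic.Int64
lastActivity.Store(time.Now().UnixNano())

go func() {
	ticker := time.NewTicker(time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-done: // stream finished normally
			return
		case <-ticker.C:
			if time.Since(time.Unix(0, lastActivity.Load())) > inactivity {
				resp.CloseBodyStream() // give up on the hung stream
				return
			}
		}
	}
}()
```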
Update all provider calls to shared OpenAI and Anthropic streaming functions to pass the new inactivityTimeoutSeconds parameter. This fixes build errors after adding the configurable timeout parameter to the shared streaming function signatures.

Changes:

- Update function signatures for shared streaming handlers to accept timeout parameter
- Add provider.networkConfig.StreamInactivityTimeoutInSeconds parameter to all streaming function calls across providers:
  - cerebras, groq, ollama, openrouter, parasail, sgl
  - azure, mistral, perplexity, gemini, vertex
- Fix timeout checks in OpenAI shared functions to use parameter instead of accessing undefined provider variable

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
Diff context for the review comment below:

```go
	return
}

// Track last activity time for inactivity timeout detection
```
@pjcdawkins can you have a common util for this? Also, let's add a field in the UI as well for the stream inactivity timeout in network config.
@pjcdawkins let's please also add the inactivity check in speech and transcription streaming.
Good points - I set this as a draft because it's a relatively hard problem and I'll come back to it soon! But it seems important. I think it'll need a few more tests to demonstrate the problem(s) and fixes, and some consolidation as you say.
Yes sure, no worries - lmk if you need a hand 😄
Also, could you rebase this branch onto our latest main? We had to nuke one file from the history, which had to be done via a rewrite.
Summary
Fixes #828: Workers and streaming goroutines do not exit and leak resources on context cancellation.
This PR addresses two critical goroutine leak issues:
Changes
Worker Loop Fix
Refactored `requestWorker` in `core/bifrost.go` to monitor `bifrost.ctx.Done()` using a select statement. Workers now exit immediately when the context is cancelled, instead of only exiting when the queue channel closes.

Before:
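A minimal sketch of the old pattern as described (the `handleRequest` helper is illustrative, not the actual function name):

```go
// Old worker loop: exits only when the queue channel is closed, so a
// cancelled context leaves the goroutine blocked on the receive.
for req := range queue {
	bifrost.handleRequest(req) // assumed dispatch helper
}
```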
After:
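A sketch of the select-based replacement, including the drain behavior described in the later commits (helper and field names are again illustrative):

```go
// New worker loop: also watches bifrost.ctx.Done() and drains the queue
// with cancellation errors so no caller waits forever on a response.
for {
	select {
	case req, ok := <-queue:
		if !ok {
			return // queue closed: normal shutdown
		}
		bifrost.handleRequest(req) // assumed dispatch helper
	case <-bifrost.ctx.Done():
		for req := range queue { // blocking drain until the queue is closed
			select {
			case req.Err <- schemas.BifrostError{
				Error: &schemas.ErrorField{Message: bifrost.ctx.Err().Error()},
			}:
			case <-req.Context.Done(): // caller already cancelled; skip send
			}
		}
		return
	}
}
```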
Streaming Handler Fixes
Added context monitoring goroutines to all streaming handlers that force-close the response body stream when the context is cancelled. This unblocks `scanner.Scan()` calls that were preventing goroutines from exiting.

Pattern applied to all streaming methods:
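A sketch of that watcher pattern, assuming `resp` is the fasthttp response whose `CloseBodyStream()` unblocks the scanner, as the review comments above describe:

```go
// Watcher: force-close the body stream on cancellation so a blocked
// scanner.Scan() returns; close(done) retires the watcher on normal exit.
done := make(chan struct{})
go func() {
	select {
	case <-ctx.Done():
		resp.CloseBodyStream() // unblocks the blocked scanner.Scan()
	case <-done: // stream loop finished; watcher exits without closing
	}
}()
defer close(done)
```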
Fixed in 11 streaming methods across 4 providers: OpenAI (5), Anthropic (2), Cohere (2), Gemini (2).
Testing
Added `TestBifrostContextCancellationCleanup` to verify goroutines clean up properly after context timeout.

Test Results:
Type of change

- [x] Bug fix
Affected areas

- [x] Core (Go)
- [x] Providers/Integrations
How to test
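A plausible way to exercise the fix, assuming the new test lives in `core/bifrost_test.go` as the review indicates:

```sh
# Run the new goroutine-leak regression test
go test ./core -run TestBifrostContextCancellationCleanup -v

# Full suite
go test ./...
```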
Breaking changes

- [x] No
Checklist
- [x] I read `docs/contributing/README.md` and followed the guidelines