
Conversation

@Pratham-Mishra04
Collaborator

Summary

Briefly explain the purpose of this PR and the problem it solves.

Changes

  • What was changed and why
  • Any notable design decisions or trade-offs

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP)
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs

How to test

Describe the steps to validate this change. Include commands and expected outcomes.

# Core/Transports
go version
go test ./...

# UI
cd ui
pnpm i || npm i
pnpm test || npm test
pnpm build || npm run build

If adding new configs or environment variables, document them here.

Screenshots/Recordings

If UI changes, add before/after screenshots or short clips.

Breaking changes

  • Yes
  • No

If yes, describe impact and migration instructions.

Related issues

Link related issues and discussions. Example: Closes #123

Security considerations

Note any security implications (auth, secrets, PII, sandboxing, etc.).

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate
  • I updated documentation where needed
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable

Collaborator Author

Pratham-Mishra04 commented Dec 4, 2025

@coderabbitai
Contributor

coderabbitai bot commented Dec 4, 2025

Warning

Rate limit exceeded

@Pratham-Mishra04 has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 14 minutes and 17 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 68bd009 and 4276354.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (28)
  • core/internal/testutil/account.go (1 hunks)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/bedrock/bedrock_test.go (15 hunks)
  • core/providers/bedrock/utils.go (1 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (3 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (4 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)
📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for reasoning and thinking content types across multiple providers
    • Enhanced streaming response handling with improved lifecycle event support
    • Added reasoning parameters display in log details
  • Bug Fixes

    • Improved error handling and decoding across provider integrations
    • Fixed tool input delta handling for more reliable tool execution
    • Better parameter validation for improved stability
  • UI Improvements

    • Better content rendering with word wrapping for long text
    • Enhanced log message visibility and filtering


Walkthrough

This pull request extends support for reasoning/thinking content blocks across multiple AI providers (Anthropic, OpenAI, Gemini, Cohere, Bedrock) with bidirectional schema conversions, enhanced streaming response handling, improved error handling, user field sanitization, and corresponding UI updates to display reasoning parameters and summary content.

Changes

Cohort / File(s) Summary
Anthropic Provider
core/providers/anthropic/errors.go, core/providers/anthropic/types.go, core/providers/anthropic/chat.go
Added ToAnthropicResponsesStreamError for SSE-formatted error streaming; added AnthropicContentBlockTypeRedactedThinking constant and Data field to content blocks; relaxed guard condition for tool input delta emission to allow empty PartialJSON strings.
OpenAI Provider
core/providers/openai/responses.go, core/providers/openai/types.go, core/providers/openai/chat.go, core/providers/openai/text.go, core/providers/openai/utils.go
Implemented reasoning-aware message filtering and conversion; added custom UnmarshalJSON/MarshalJSON for request serialization; added MaxUserFieldLength constant and SanitizeUserField function to enforce 64-character user field limits across chat and text completion requests.
Cohere Provider
core/providers/cohere/responses.go
Added comprehensive bidirectional conversion utilities for tool choices, messages, and content blocks; introduced per-stream reasoning tracking; expanded public API surface for streaming lifecycle events (Created, InProgress, OutputTextDone, etc.).
Gemini Provider
core/providers/gemini/responses.go
Enhanced function call handling with JSON marshaling guards; added Summary field to ResponsesReasoning messages; improved ThoughtSignature preservation for Gemini 3 Pro.
Bedrock Provider
core/providers/bedrock/utils.go, core/providers/bedrock/bedrock_test.go, core/internal/testutil/account.go
Made inference parameters optional; added Claude 4.5 Haiku deployment mapping; extensively updated tests to validate tool call interleaving, status fields, and payload structures.
Vertex Provider
core/providers/vertex/errors.go
Enhanced error decoding via CheckAndDecodeBody before JSON unmarshalling for improved error reporting.
Core Schemas
core/schemas/responses.go, core/schemas/bifrost.go
Added StopReason and Signature fields to response structures; renamed ResponsesReasoningContent to ResponsesReasoningSummary; added BifrostContextKeyIntegrationType constant for integration type propagation.
Framework Streaming
framework/streaming/responses.go
Added ReasoningSummaryTextDelta handling with helper methods appendReasoningDeltaToResponsesMessage and appendReasoningSignatureToResponsesMessage; added nil-safety checks around RawRequest access.
HTTP Transport & Integration
transports/bifrost-http/handlers/inference.go, transports/bifrost-http/integrations/anthropic.go, transports/bifrost-http/integrations/router.go, transports/bifrost-http/handlers/middlewares.go
Added custom UnmarshalJSON for ResponsesRequest; enhanced Anthropic streaming with multi-event SSE aggregation; added integration type context propagation and conditional DONE marker emission for Anthropic routes; minor whitespace cleanup.
Frontend UI
ui/lib/types/logs.ts, ui/app/workspace/logs/views/logResponsesMessageView.tsx, ui/app/workspace/logs/views/logResponsesOutputView.tsx, ui/app/workspace/logs/views/logDetailsSheet.tsx, ui/app/workspace/logs/views/columns.tsx
Renamed ResponsesReasoningContent interface to ResponsesReasoningSummary; removed LogResponsesMessageView export; added early return for empty reasoning messages; added Reasoning Parameters section to detail sheet; changed transcription audio message fallback behavior; updated break-words styling.
Utilities & Dependencies
core/providers/utils/utils.go, ui/package.json
Added GetRandomString function; enhanced HandleProviderAPIError with body decoding; switched JSON formatting to indented output via sonic.MarshalIndent; updated lucide-react dependency from exact to caret version.
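
The user-field sanitization described for the OpenAI provider above (the MaxUserFieldLength constant and SanitizeUserField function) can be sketched roughly as follows. This is an illustrative stand-in, not the repository's implementation; the function name casing and the byte-based truncation are assumptions.

```go
package main

import "fmt"

// MaxUserFieldLength mirrors the 64-character limit described in the
// PR summary; the real constant lives in core/providers/openai.
const MaxUserFieldLength = 64

// sanitizeUserField truncates over-long user identifiers and passes
// shorter values through unchanged. Note this truncates by byte, which
// could split a multi-byte UTF-8 rune; the actual implementation may
// handle that differently.
func sanitizeUserField(user string) string {
	if len(user) <= MaxUserFieldLength {
		return user
	}
	return user[:MaxUserFieldLength]
}

func main() {
	long := make([]byte, 100)
	for i := range long {
		long[i] = 'a'
	}
	fmt.Println(len(sanitizeUserField(string(long)))) // 64
	fmt.Println(sanitizeUserField("alice"))           // alice
}
```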

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45–75 minutes

Areas requiring extra attention:

  • Bidirectional conversion logic in core/providers/cohere/responses.go — extensive helper functions with multiple content block and tool call mappings across schemas
  • Streaming response semantics across providers — verify reasoning delta/signature handling and event ordering in framework/streaming/responses.go and provider-specific paths
  • Request serialization changes in OpenAI (types.go, responses.go) — custom marshal/unmarshal logic and conditional message filtering based on reasoning state
  • Context propagation and routing in transports/bifrost-http/integrations/router.go — interaction between DONE marker suppression, integration type context, and Anthropic-specific paths
  • Bedrock test coverage expansion — validate that tool call interleaving and status transitions are correctly exercised in bedrock_test.go
  • UI component removal of logResponsesOutputView.tsx — verify that rendering logic is safely migrated or no longer needed in dependent views

Poem

🐰 Hops through reasoning blocks with glee,
Streaming deltas flowing wild and free,
From OpenAI to Cohere's embrace,
Signatures and summaries find their place,
Bidirectional magic, converters unite,
This pull makes the thoughts shine bright!

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 2 inconclusive)
  • Description check ⚠️ Warning: The PR description is entirely a template with no concrete information filled in; all sections lack actual details about changes, motivations, testing steps, or related issues. Resolution: complete the PR description with actual content: explain what was changed, why changes were made, which areas were affected, specific test commands, and link related issues.
  • Linked Issues check ⚠️ Warning: The linked issue #123 is about Files API support, which is completely unrelated to the changeset focused on responses reasoning, streaming deltas, content block signatures, and provider integrations. Resolution: link the correct issues related to the responses reasoning implementation, or remove the unrelated #123 issue link and add appropriate issue references.
  • Title check ❓ Inconclusive: The title 'feat: responses reasoning fixes' is vague and lacks specific detail about what was actually fixed or implemented in the responses reasoning functionality. Resolution: provide a more specific title that clearly describes the reasoning fixes (e.g., 'feat: add reasoning summary support and content block signatures to responses').
  • Out of Scope Changes check ❓ Inconclusive: Numerous changes appear unrelated to reasoning fixes: Bedrock test updates, a dependency version change (lucide-react), whitespace cleanup, transcription logic changes, and various provider-specific logic that is not strictly reasoning-focused. Resolution: clarify in the PR description which changes are in scope for reasoning fixes and whether auxiliary changes (tests, dependencies, UI) are intentional or should be separated into different PRs.
✅ Passed checks (1 passed)
  • Docstring Coverage ✅ Passed: Docstring coverage is 85.00%, which is sufficient; the required threshold is 80.00%.

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (9)
transports/bifrost-http/integrations/anthropic.go (1)

72-77: Non‑stream Anthropic /v1/messages no longer supports raw‑response passthrough; consider confirming intent and cleaning up commented code

The previous behavior short‑circuited to resp.ExtraFields.RawResponse for Anthropic if present; now that path is commented out and we always go through anthropic.ToAnthropicResponsesResponse(resp). This is a real behavior change for non‑stream responses:

  • If any callers relied on getting the provider’s raw body for /v1/messages (non‑stream), they will now receive the normalized Anthropic struct instead.
  • In streaming, we still support raw passthrough gated by BifrostContextKeySendBackRawResponse, so behavior is now asymmetric between streaming vs non‑stream.

If the intent is to fully normalize non‑stream responses (e.g., to ensure reasoning metadata is always passed through via our schemas), this looks fine functionally, but I’d suggest:

  1. Remove the commented block to avoid dead code, and
  2. Optionally add a brief comment above the converter clarifying that non‑stream Anthropic responses are intentionally always normalized and that raw passthrough is streaming‑only.

If, instead, non‑stream raw passthrough is still desired in some cases, we probably want to reintroduce this logic but gate it similarly to streaming using BifrostContextKeySendBackRawResponse for consistency.

core/providers/utils/utils.go (1)

951-960: No urgent security fix needed for this use case, but consider documenting the function's non-security purpose.

GetRandomString is used only for generating internal message IDs in Anthropic response parsing (with prefixes like msg_ and rs_), not for authentication tokens or security-sensitive identifiers. While math/rand is indeed not cryptographically secure, the current implementation is appropriate for internal message tracking.

If you want to prevent future misuse, add a doc comment clarifying this is not suitable for security-sensitive purposes. Alternatively, if you're concerned about consistency with Go best practices for all random generation, using crypto/rand is defensible but not critical for this use case.
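
A sketch of both suggestions, a warning doc comment plus a reusable package-level source, might look like the following. The function name casing, alphabet, and seed are illustrative assumptions, not the repository's GetRandomString implementation.

```go
package main

import (
	"fmt"
	"math/rand"
)

// idRand is a package-level source, seeded once and reused across
// calls instead of being recreated per invocation. A fixed seed is
// used here purely for demonstration.
var idRand = rand.New(rand.NewSource(42))

const idAlphabet = "abcdefghijklmnopqrstuvwxyz0123456789"

// getRandomString returns prefix followed by n random characters.
// NOTE: math/rand is NOT cryptographically secure; use this only for
// internal identifiers such as the msg_ / rs_ message IDs mentioned
// above, never for tokens or other security-sensitive values.
func getRandomString(prefix string, n int) string {
	b := make([]byte, n)
	for i := range b {
		b[i] = idAlphabet[idRand.Intn(len(idAlphabet))]
	}
	return prefix + string(b)
}

func main() {
	id := getRandomString("msg_", 8)
	fmt.Println(len(id)) // 12
}
```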

core/providers/openai/responses.go (2)

57-59: Duplicate condition check.

Line 57 and line 59 both check len(message.ResponsesReasoning.Summary) > 0. This is redundant.

 			// If the message has summaries but no content blocks and the model is gpt-oss, then convert the summaries to content blocks
 			if len(message.ResponsesReasoning.Summary) > 0 &&
 				strings.Contains(bifrostReq.Model, "gpt-oss") &&
-				len(message.ResponsesReasoning.Summary) > 0 &&
 				message.Content == nil {

45-84: Consider extracting model-specific reasoning logic to improve readability.

The nested conditionals handling reasoning content transformation are complex. The logic correctly handles:

  1. Skipping reasoning blocks without summaries for non-gpt-oss models
  2. Converting summaries to content blocks for gpt-oss models
  3. Passing through other messages unchanged

However, using strings.Contains(bifrostReq.Model, "gpt-oss") for model detection may be fragile if model naming conventions change.

Consider extracting a helper function like isGptOssModel(model string) bool for clearer intent and easier maintenance:

func isGptOssModel(model string) bool {
    return strings.Contains(model, "gpt-oss")
}

This would make the conditional checks more readable and centralize the model detection logic.

core/providers/gemini/responses.go (2)

143-146: Consider using sonic.Marshal for consistency.

This uses encoding/json.Marshal while the rest of the codebase uses github.com/bytedance/sonic for JSON operations. For consistency and potential performance benefits, consider using sonic.Marshal here.

-				if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil {
+				if argsBytes, err := sonic.Marshal(part.FunctionCall.Args); err == nil {
 					argumentsStr = string(argsBytes)
 				}

You would also need to add the sonic import if not already present via another code path.


263-272: Duplicate ID generation logic.

Lines 264-267 and 269-271 contain duplicate logic for generating itemID. The second block (269-271) appears to be redundant as it only handles the MessageID == nil case which is already covered.

 			// Generate stable ID for text item
 			var itemID string
 			if state.MessageID == nil {
 				itemID = fmt.Sprintf("item_%d", outputIndex)
 			} else {
 				itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex)
 			}
-			if state.MessageID == nil {
-				itemID = fmt.Sprintf("item_%d", outputIndex)
-			}
 			state.ItemIDs[outputIndex] = itemID
core/providers/cohere/responses.go (1)

263-272: Duplicate ID generation pattern repeated multiple times.

The same ID generation logic with the redundant second if block appears in multiple places (lines 263-272, 306-316, and 421-429). This appears to be copy-paste duplication.

Consider extracting a helper function and removing the duplicate conditional:

func (state *CohereResponsesStreamState) generateItemID(outputIndex int, prefix string) string {
    if state.MessageID == nil {
        return fmt.Sprintf("%s_%d", prefix, outputIndex)
    }
    return fmt.Sprintf("msg_%s_%s_%d", *state.MessageID, prefix, outputIndex)
}

Then use it consistently:

itemID := state.generateItemID(outputIndex, "item")
state.ItemIDs[outputIndex] = itemID

Also applies to: 306-316, 421-429

transports/bifrost-http/handlers/inference.go (2)

224-254: Custom ResponsesRequest unmarshal aligns with chat pattern; consider guarding against reuse-side effects

The split unmarshal (BifrostParams → Input union → ResponsesParameters) looks correct and mirrors the ChatRequest.UnmarshalJSON pattern, so it should resolve the embedded‐struct issues with ResponsesParameters’ custom unmarshaller.

If you ever end up reusing a ResponsesRequest instance for multiple decodes, this implementation can leave stale values in fields that are omitted in subsequent payloads (standard encoding/json behavior, but now under your control). It’s not a problem for the current usage (fresh var req ResponsesRequest per request), but if you want stricter reset semantics you could zero the struct at the start of the method before re-populating it.


91-118: responsesParamsKnownFields omits "user"; likely ends up duplicated in ExtraParams

ResponsesParameters has a User *string field tagged `json:"user,omitempty"`, but "user" is not listed in responsesParamsKnownFields. That means /v1/responses requests with a user field will both populate ResponsesParameters.User (via sonic.Unmarshal) and also be treated as an unknown extra param and forwarded in ExtraParams. This is inconsistent with the chat path (where "user" is marked as known) and could cause confusing duplication for provider adapters that look at ExtraParams.

If user is intended to be a first-class, schema-level field for responses (same as chat), consider adding it here so it is not treated as a provider-specific extra:

 var responsesParamsKnownFields = map[string]bool{
   "model":                true,
   "input":                true,
   "fallbacks":            true,
   "stream":               true,
@@
   "top_p":                true,
   "tool_choice":          true,
   "tools":                true,
  "truncation":           true,
+  "user":                 true,
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6cf3108 and a15c48b.

📒 Files selected for processing (12)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (2 hunks)
  • core/providers/cohere/responses.go (4 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (3 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (1 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/anthropic/errors.go
  • transports/bifrost-http/handlers/inference.go
  • core/providers/openai/types.go
  • ui/lib/types/logs.ts
  • core/providers/anthropic/types.go
  • core/providers/openai/responses.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/utils/utils.go
  • core/providers/gemini/responses.go
  • core/providers/openai/openai.go
  • core/schemas/responses.go
  • core/providers/cohere/responses.go
🧬 Code graph analysis (6)
core/providers/anthropic/errors.go (2)
core/providers/anthropic/types.go (3)
  • AnthropicStreamEvent (328-337)
  • AnthropicStreamEventTypeError (324-324)
  • AnthropicStreamError (399-402)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamError (1392-1413)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (85-112)
core/providers/openai/types.go (1)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (232-237)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
  • ResponsesReasoningSummary (743-746)
  • ResponsesReasoning (729-732)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (402-408)
  • ResponsesMessage (422-437)
  • ResponsesMessageContent (399-399)
  • ResponsesReasoning (416-419)
  • ResponsesReasoningSummary (411-414)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (460-480)
  • ResponsesMessage (312-325)
  • ResponsesMessageContent (337-342)
  • ResponsesReasoning (729-732)
  • ResponsesReasoningSummary (743-746)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (351-358)
  • ResponsesReasoningSummary (411-414)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
  • GitHub Check: Graphite / mergeability_check (×15)
🔇 Additional comments (13)
core/providers/anthropic/types.go (2)

143-144: LGTM! Proper addition of redacted thinking support.

The new redacted_thinking content block type aligns with Anthropic's extended thinking API, which returns encrypted thinking content when the model's reasoning needs to be redacted. This is consistent with the Data field added below for storing the encrypted content.


153-153: Clear documentation for the Data field.

The comment accurately describes the field's purpose for encrypted redacted thinking data.

core/providers/openai/types.go (1)

154-192: Well-structured custom marshaling with clear intent.

The implementation correctly:

  1. Preserves custom Input marshaling via json.RawMessage
  2. Clones Reasoning to avoid mutating the original struct
  3. Always sets MaxTokens to nil as OpenAI's API doesn't support this parameter

One minor note: the code mixes encoding/json (for RawMessage) with sonic (for final marshal), which is intentional but could be documented.

core/providers/openai/responses.go (1)

41-94: Logic correctly handles reasoning content transformation.

The overall transformation logic for handling reasoning content blocks across different OpenAI model variants is sound. The approach of building a new messages slice while selectively transforming or skipping messages based on model capabilities is appropriate.

core/providers/anthropic/errors.go (1)

36-58: The function ToAnthropicResponsesStreamError exists only in core/providers/anthropic/errors.go and is not duplicated elsewhere in the codebase. There is no duplicate function definition in responses.go or any other file. This code can be merged without compilation errors related to duplication.

Likely an incorrect or invalid review comment.

ui/lib/types/logs.ts (1)

411-419: Type rename looks good and aligns with Go schema.

The renaming from ResponsesReasoningContent to ResponsesReasoningSummary is consistent with the corresponding changes in core/schemas/responses.go (lines 742-745). The field structure matches the Go definition.

core/schemas/responses.go (2)

398-401: New Signature field addition looks correct.

The Signature field is appropriately added as an optional pointer field for carrying content signatures (used for reasoning in Gemini 3 Pro). The JSON tag with omitempty is correct for optional fields.


728-746: Type rename and structure updates are consistent.

The ResponsesReasoning struct now uses []ResponsesReasoningSummary for the Summary field, and the new ResponsesReasoningSummary type is properly defined with Type and Text fields. This aligns with the corresponding TypeScript types in ui/lib/types/logs.ts.

core/providers/gemini/responses.go (3)

148-164: Good fix for range loop variable capture issue.

Creating copies of functionCallID and functionCallName before using them in pointer assignments correctly avoids the Go range loop variable capture issue. This is a proper fix for Go versions prior to 1.22.
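
A minimal sketch of the pattern being fixed: before Go 1.22 the range variable was reused across iterations, so taking its address captured one shared variable; copying into a fresh local per iteration (as the converter now does for functionCallID and functionCallName) gives each pointer its own value. The helper name here is illustrative.

```go
package main

import "fmt"

// copyPtrs returns one pointer per element. The per-iteration copy is
// what makes this safe on Go versions prior to 1.22; without it, every
// pointer would alias the single reused range variable.
func copyPtrs(names []string) []*string {
	ptrs := make([]*string, 0, len(names))
	for _, name := range names {
		nameCopy := name // fresh variable per iteration
		ptrs = append(ptrs, &nameCopy)
	}
	return ptrs
}

func main() {
	for _, p := range copyPtrs([]string{"a", "b", "c"}) {
		fmt.Print(*p)
	}
	fmt.Println() // abc
}
```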


166-179: Thought signature preservation looks correct.

The logic correctly creates a separate ResponsesReasoning message when a thought signature is present, using an empty Summary slice and the encrypted content. This aligns with the updated schema and supports Gemini 3 Pro requirements.


619-627: Look-ahead logic assumes reasoning message immediately follows function call.

The look-ahead for thought signature assumes the reasoning message is at index i+1. This may not handle cases where messages are reordered or there are intervening messages. Consider documenting this assumption or adding validation.

Verify that the message ordering convention (reasoning message immediately after function call) is consistently maintained across all code paths that produce these messages.

core/providers/cohere/responses.go (2)

162-765: Streaming conversion implementation is comprehensive.

The ToBifrostResponsesStream method handles the full OpenAI-style streaming lifecycle (created, in_progress, output_item.added, deltas, output_item.done, completed) with proper state management. The tool call argument accumulation and tool plan lifecycle handling appear correct.


894-1029: Message conversion logic handles reasoning blocks correctly.

The ConvertBifrostMessagesToCohereMessages function properly accumulates pending reasoning blocks and attaches them to assistant messages. The system message extraction and prepending logic is also correct.

@Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from a15c48b to d4bfce4 on December 4, 2025 at 15:55
@Pratham-Mishra04 force-pushed the 12-04-feat_raw_response_accumulation_for_streaming branch from 6cf3108 to 4b4a584 on December 4, 2025 at 15:55
coderabbitai bot requested a review from TejasGhatte on December 4, 2025 at 16:12
Contributor

coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (4)
core/providers/utils/utils.go (1)

267-267: Previous review comment still applies.

The past review already flagged this change from sonic.Marshal to sonic.MarshalIndent. The concern about increased payload size for production API requests and the associated debug prints in openai.go remains valid. Please address the feedback from the previous review.

core/providers/cohere/responses.go (3)

148-153: Returning empty struct for invalid image block remains unaddressed.

When ImageURL is nil, an empty ResponsesMessageContentBlock{} is returned with a zero-value Type field, which could cause unexpected behavior downstream when processing content blocks.

Consider one of the previously suggested approaches:

  • Return a text block indicating the missing image
  • Return (schemas.ResponsesMessageContentBlock, bool) to indicate validity
  • Skip invalid blocks at the call site

Based on learnings, this issue was previously flagged but not yet addressed.


1131-1142: Tool choice "auto" mapping to "required" remains semantically incorrect.

Line 1136 maps "auto" to ToolChoiceRequired, which changes the semantic meaning. In the Responses API, "auto" means the model decides whether to call a tool, while "required" forces a tool call.

Please verify Cohere's tool choice options and update the mapping:

#!/bin/bash
# Search for Cohere tool choice type definitions and usage
ast-grep --pattern 'type CohereToolChoice $$$'
ast-grep --pattern 'ToolChoice$_ CohereToolChoice = $_'

Based on learnings, this issue was previously flagged but not yet addressed.


1216-1225: Encrypted reasoning content exposure in plain text marker remains unaddressed.

Lines 1219-1224 wrap encrypted content in a plain text marker [ENCRYPTED_REASONING: ...], exposing the encrypted content in an unprotected format. This defeats the purpose of encryption if the content is meant to remain opaque.

Consider skipping encrypted content entirely since Cohere doesn't support it:

 		} else if msg.ResponsesReasoning.EncryptedContent != nil {
-			// Cohere doesn't have a direct equivalent to encrypted content,
-			// so we'll store it as a regular thinking block with a special marker
-			encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-			thinkingBlock := CohereContentBlock{
-				Type:     CohereContentBlockTypeThinking,
-				Thinking: &encryptedText,
-			}
-			thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+			// Skip encrypted content as Cohere doesn't support it
+			// The encrypted content should remain opaque and not be sent to other providers
 		}

Based on learnings, this issue was previously flagged but not yet addressed.

🧹 Nitpick comments (5)
transports/bifrost-http/integrations/anthropic.go (3)

74-78: Remove commented-out code.

This commented-out code block should either be removed or documented with a TODO/reason for keeping it. Leaving dead code in comments reduces readability and maintainability.

 			ResponsesResponseConverter: func(ctx *context.Context, resp *schemas.BifrostResponsesResponse) (interface{}, error) {
-				// if resp.ExtraFields.Provider == schemas.Anthropic {
-				// 	if resp.ExtraFields.RawResponse != nil {
-				// 		return resp.ExtraFields.RawResponse, nil
-				// 	}
-				// }
 				return anthropic.ToAnthropicResponsesResponse(resp), nil
 			},

94-97: Use the injected logger instead of stdlib log.

The AnthropicRouter is initialized with a schemas.Logger (line 246), but this closure uses the stdlib log.Printf. This inconsistency means errors logged here won't go through the configured logging infrastructure.

Consider passing the logger through the route config or using a package-level logger that can be configured.


103-117: Remove large commented-out code block.

This 15-line commented block should be removed. If this logic might be needed in the future, document the intent in a TODO or track it in an issue rather than leaving dead code.

 					} else {
-						// if resp.ExtraFields.Provider == schemas.Anthropic ||
-						// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-						// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-						// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-						// 	// This is always true in integrations
-						// 	isRawResponseEnabled, ok := (*ctx).Value(schemas.BifrostContextKeySendBackRawResponse).(bool)
-						// 	if ok && isRawResponseEnabled {
-						// 		if resp.ExtraFields.RawResponse != nil {
-						// 			return string(anthropicResponse[len(anthropicResponse)-1].Type), resp.ExtraFields.RawResponse, nil
-						// 		} else {
-						// 			// Explicitly return nil to indicate that no raw response is available (because 1 chunk of anthropic gets converted to multiple bifrost responses chunks)
-						// 			return "", nil, nil
-						// 		}
-						// 	}
-						// }
 						return string(anthropicResponse[0].Type), anthropicResponse[0], nil
 					}
core/providers/utils/utils.go (1)

950-960: Consider using a package-level random source for better performance.

The current implementation creates a new rand.Source on every call, which is inefficient. However, the collision risk from time.Now().UnixNano() seeding is minimal in practice since GetRandomString is used for generating message IDs in response processing (not in tight loops where nanosecond collisions would occur).

For non-security use cases like message identification, consider a simpler optimization using a package-level source with synchronization:

var (
	randMu  sync.Mutex
	randSrc = rand.New(rand.NewSource(time.Now().UnixNano()))
)

func GetRandomString(length int) string {
	letters := []rune("abcdefghijklmnopqrstuvwxyz0123456789")
	b := make([]rune, length)
	randMu.Lock()
	for i := range b {
		b[i] = letters[randSrc.Intn(len(letters))]
	}
	randMu.Unlock()
	return string(b)
}

This avoids repeated allocations without the complexity of crypto/rand.

core/providers/openai/responses.go (1)

56-81: Duplicate condition on line 59.

The condition len(message.ResponsesReasoning.Summary) > 0 is checked twice in the same if statement at lines 57 and 59.

Apply this diff to remove the redundant check:

 			// If the message has summaries but no content blocks and the model is gpt-oss, then convert the summaries to content blocks
 			if len(message.ResponsesReasoning.Summary) > 0 &&
 				strings.Contains(bifrostReq.Model, "gpt-oss") &&
-				len(message.ResponsesReasoning.Summary) > 0 &&
 				message.Content == nil {
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a15c48b and d4bfce4.

📒 Files selected for processing (13)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (3 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (3 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • core/providers/openai/openai.go
  • transports/bifrost-http/handlers/inference.go
  • core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • ui/lib/types/logs.ts
  • core/providers/anthropic/errors.go
  • core/providers/openai/types.go
  • core/providers/utils/utils.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/providers/openai/responses.go
  • core/providers/cohere/responses.go
  • framework/streaming/responses.go
🧬 Code graph analysis (7)
ui/lib/types/logs.ts (1)
core/schemas/responses.go (2)
  • ResponsesReasoningSummary (744-747)
  • ResponsesReasoning (730-733)
core/providers/anthropic/errors.go (2)
core/providers/anthropic/types.go (3)
  • AnthropicStreamEvent (328-337)
  • AnthropicStreamEventTypeError (324-324)
  • AnthropicStreamError (399-402)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamError (1392-1413)
core/providers/openai/types.go (1)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
transports/bifrost-http/integrations/anthropic.go (1)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (5)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • Part (936-960)
core/schemas/responses.go (8)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageTypeFunctionCall (294-294)
  • ResponsesMessageTypeReasoning (306-306)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/openai/responses.go (2)
core/schemas/responses.go (5)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
framework/streaming/responses.go (2)
core/schemas/responses.go (8)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392)
  • ResponsesMessage (313-326)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
core/schemas/utils.go (1)
  • Ptr (16-18)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (20)
transports/bifrost-http/integrations/anthropic.go (1)

85-119: Streaming response handling logic looks correct.

The three-way branching based on response count handles:

  • Empty responses (OpenAI lifecycle events) → returns empty
  • Multiple events → concatenates as SSE format
  • Single event → returns directly

The SSE format construction at line 99 correctly follows event: <type>\ndata: <json>\n\n.
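The framing described above can be sketched in a few lines. This is an illustrative stand-in, not Bifrost's actual converter; the function name and payload are hypothetical.

```go
package main

import "fmt"

// formatSSE renders one server-sent event in the
// "event: <type>\ndata: <json>\n\n" shape the converter emits when
// concatenating multiple Anthropic events into a single chunk.
func formatSSE(eventType string, payload []byte) string {
	return fmt.Sprintf("event: %s\ndata: %s\n\n", eventType, payload)
}

func main() {
	chunk := formatSSE("content_block_delta", []byte(`{"delta":"hi"}`))
	fmt.Printf("%q\n", chunk)
}
```

Note the trailing blank line (`\n\n`): SSE consumers use it as the event delimiter, so dropping it would merge adjacent events.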

core/providers/openai/types.go (1)

154-192: LGTM - Custom marshaling correctly omits max_tokens for OpenAI.

The implementation correctly:

  1. Preserves the custom Input marshaling via json.RawMessage
  2. Copies reasoning fields while explicitly setting MaxTokens to nil
  3. Follows the same pattern as OpenAIChatRequest.MarshalJSON above

Please verify that OpenAI's Responses API indeed does not support reasoning.max_tokens and should have it omitted from requests.

core/providers/openai/responses.go (1)

41-85: Reasoning transformation logic looks correct.

The bidirectional conversion between gpt-oss reasoning content blocks and standard OpenAI summaries+encrypted_content is well-structured. The three branches handle:

  1. Skip messages with content blocks but no summaries for non-gpt-oss models
  2. Convert summaries to content blocks for gpt-oss models
  3. Preserve messages as-is for other cases

The model detection via strings.Contains(bifrostReq.Model, "gpt-oss") is fragile. Consider verifying this matches the actual model naming convention and whether a more robust check (e.g., a helper function or constant) would be appropriate.

ui/lib/types/logs.ts (1)

411-433: LGTM - Type rename aligns with backend schema changes.

The rename from ResponsesReasoningContent to ResponsesReasoningSummary correctly mirrors the backend ResponsesReasoningSummary struct in core/schemas/responses.go (lines 743-746), maintaining consistency across the codebase.

framework/streaming/responses.go (3)

497-534: LGTM - Reasoning summary streaming accumulation.

The new case for ResponsesStreamResponseTypeReasoningSummaryTextDelta correctly:

  1. Searches for existing reasoning message by ItemID
  2. Creates a new reasoning message with proper type and role if not found
  3. Delegates to helper methods for delta and signature handling

626-679: Reasoning delta accumulation handles both storage modes correctly.

The helper properly branches on contentIndex:

  • With index: Stores in content blocks as reasoning_text type
  • Without index: Accumulates into ResponsesReasoning.Summary

The comment on lines 667-668 acknowledges the current limitation of accumulating into a single summary entry.



681-727: Signature accumulation logic is correct.

Follows the same pattern as delta handling, storing signatures in either:

  • ContentBlock.Signature when contentIndex is provided
  • ResponsesReasoning.EncryptedContent otherwise

This aligns with the schema design where EncryptedContent serves as the reasoning-level signature storage.

core/schemas/responses.go (4)

68-68: LGTM - StopReason field addition.

The StopReason field appropriately handles non-OpenAI providers that return stop reasons in a different format, with a clear comment noting it's not part of OpenAI's spec.


398-402: LGTM - Signature field for content blocks.

Adding the Signature field to ResponsesMessageContentBlock enables per-block signature storage for reasoning content, which aligns with the streaming accumulation logic in framework/streaming/responses.go.


729-747: LGTM - Rename to ResponsesReasoningSummary.

The rename from ResponsesReasoningContent to ResponsesReasoningSummary better reflects the purpose of this struct and maintains consistency with the UI types in ui/lib/types/logs.ts.


1439-1441: LGTM - Signature field for streaming responses.

Adding Signature to BifrostResponsesStreamResponse enables streaming signature deltas alongside text deltas, supporting the reasoning accumulation logic.

core/providers/gemini/responses.go (2)

138-179: LGTM! Good handling of function calls and thought signatures.

The implementation correctly:

  • Avoids range loop variable capture by creating copies of functionCallID and functionCallName
  • Preserves Gemini's ThoughtSignature as encrypted content in a separate reasoning message
  • Initializes the Summary field as an empty slice, consistent with the new schema structure

609-629: LGTM! Proper bidirectional conversion with safe look-ahead.

The look-ahead mechanism correctly:

  • Checks array bounds before accessing the next message
  • Validates the next message is a reasoning type with encrypted content
  • Preserves the thought signature from the Bifrost reasoning message back to Gemini's format

This maintains consistency with the reverse conversion in convertGeminiCandidatesToResponsesOutput.

core/providers/cohere/responses.go (7)

17-17: LGTM! Proper state tracking for reasoning content.

The ReasoningContentIndices field is correctly:

  • Initialized in the pool's New function
  • Handled with defensive nil checks in acquireCohereResponsesStreamState
  • Cleared in the flush method, consistent with other map fields

This enables proper tracking of reasoning blocks during streaming conversion.

Also applies to: 34-34, 64-68, 106-110


318-368: LGTM! Proper reasoning content lifecycle handling.

The thinking/reasoning block handling correctly:

  • Creates a reasoning message with the appropriate type and empty Summary slice
  • Tracks the content index in ReasoningContentIndices for downstream event emission
  • Emits OpenAI-style lifecycle events (output_item.added, content_part.added)
  • Generates stable item IDs consistent with other content types

395-410: LGTM! Correct differentiation between text and reasoning deltas.

The implementation properly emits reasoning_summary_text.delta events for thinking content (line 400) instead of output_text.delta, ensuring downstream consumers can distinguish between regular text and reasoning updates.


420-449: LGTM! Proper reasoning block cleanup and event emission.

The content end handling correctly:

  • Uses ReasoningContentIndices to differentiate reasoning from text blocks
  • Emits reasoning_summary_text.done for reasoning (line 425) vs. output_text.done for text (line 454)
  • Cleans up the tracking map (line 449) to prevent memory leaks

977-1112: LGTM! Comprehensive message conversion with proper state management.

The conversion function correctly handles:

  • Accumulation of reasoning blocks via pendingReasoningContentBlocks
  • Association of reasoning with assistant messages
  • Proper flushing of pending blocks at function end (lines 1090-1100)
  • System message collection and prepending (lines 1102-1109)

The state machine logic is complex but appears sound for managing the various message types and their relationships.


850-932: LGTM! Comprehensive request conversion with proper parameter mapping.

The ToCohereResponsesRequest function correctly:

  • Maps standard parameters (temperature, top_p, max_tokens)
  • Extracts Cohere-specific options from ExtraParams (top_k, thinking, penalties)
  • Converts tools and tool choice using dedicated helper functions
  • Delegates message conversion to ConvertBifrostMessagesToCohereMessages

The structure is clean and follows established patterns in the codebase.


1370-1383: The reasoning message structure is correct and not redundant. Cohere provides reasoning as reasoning_text content blocks (line 1327-1331), which are correctly placed in Content.ContentBlocks while ResponsesReasoning.Summary remains empty. This dual-field pattern is intentional: ResponsesReasoning.Summary is used by providers that send reasoning summaries (e.g., some OpenAI models), while Content.ContentBlocks is used for reasoning_text blocks (Cohere, Bedrock, Anthropic). When converting back to provider format (line 1207-1209), the code checks Summary first—which is empty for content that originated from blocks, and that's correct.

Likely an incorrect or invalid review comment.

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
framework/streaming/responses.go (1)

71-104: New Signature field is not preserved in deep copies

  • deepCopyResponsesStreamResponse copies Delta and LogProbs but never copies the new Signature field on BifrostResponsesStreamResponse (Lines 71–104), so any signature arriving from providers is lost when we stash stream responses in the accumulator.
  • deepCopyResponsesMessageContentBlock similarly never copies the new Signature (and still ignores FileID) on ResponsesMessageContentBlock (Lines 382–424), so block‑level signatures won’t survive accumulation either.

Both issues mean the newly added reasoning signature plumbing in buildCompleteMessageFromResponsesStreamChunks / appendReasoningSignatureToResponsesMessage can never see those signatures.

Consider updating both helpers along these lines:

func deepCopyResponsesStreamResponse(original *schemas.BifrostResponsesStreamResponse) *schemas.BifrostResponsesStreamResponse {
    ...
-   if original.Delta != nil {
-       copyDelta := *original.Delta
-       copy.Delta = &copyDelta
-   }
+   if original.Delta != nil {
+       copyDelta := *original.Delta
+       copy.Delta = &copyDelta
+   }
+   if original.Signature != nil {
+       copySig := *original.Signature
+       copy.Signature = &copySig
+   }
    ...
}
func deepCopyResponsesMessageContentBlock(original schemas.ResponsesMessageContentBlock) schemas.ResponsesMessageContentBlock {
-   copy := schemas.ResponsesMessageContentBlock{
-       Type: original.Type,
-   }
+   copy := schemas.ResponsesMessageContentBlock{
+       Type: original.Type,
+   }
+   if original.FileID != nil {
+       id := *original.FileID
+       copy.FileID = &id
+   }
+   if original.Signature != nil {
+       sig := *original.Signature
+       copy.Signature = &sig
+   }
    if original.Text != nil {
        copyText := *original.Text
        copy.Text = &copyText
    }
    ...
}

Also applies to: 382-424

♻️ Duplicate comments (4)
core/providers/utils/utils.go (1)

267-267: Revert to sonic.Marshal for production performance.

This change from sonic.Marshal to sonic.MarshalIndent was flagged in a previous review but remains unaddressed. The indented JSON increases payload size and bandwidth for all provider API requests without any documented justification. Provider APIs do not require formatted JSON.

Unless there is a specific requirement for indented JSON (which should be documented with a code comment), revert this change.

Apply this diff to revert the change:

-		jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+		jsonBody, err := sonic.Marshal(convertedBody)
core/providers/cohere/responses.go (3)

140-172: Empty content block for invalid image URL is fragile

When cohereBlock.Type == CohereContentBlockTypeImage and cohereBlock.ImageURL == nil, the function returns a zero‑value ResponsesMessageContentBlock{} (Type == ""), which can later end up in ContentBlocks and surprise downstream logic that expects a valid Type.

Consider either:

  • Skipping such blocks entirely, or
  • Returning a text block indicating an invalid/missing image instead of an empty block.

1127-1145: Cohere tool choice mapping treats "auto" and unknown values as "required"

convertBifrostToolChoiceToCohereToolChoice:

switch *toolChoiceString {
case "none":
    choice := ToolChoiceNone
    return &choice
case "required", "auto", "function":
    choice := ToolChoiceRequired
    return &choice
default:
    choice := ToolChoiceRequired
    return &choice
}

Maps both "auto" and any unknown string to ToolChoiceRequired, which forces a tool call instead of letting the model decide. That’s a semantic change from OpenAI‑style "auto" and may not match Cohere’s API either.

Consider instead:

  • Mapping "none" → ToolChoiceNone,
  • "required" / "function" → ToolChoiceRequired,
  • "auto" (and other/unknown values) → nil to fall back to Cohere defaults, or a dedicated "auto" enum if available.
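The suggested mapping can be sketched as below. The enum names and string values are placeholders, not Cohere's actual wire format; the key change is that "auto" and unrecognized inputs return nil so the provider default applies.

```go
package main

import "fmt"

type cohereToolChoice string

const (
	toolChoiceNone     cohereToolChoice = "NONE"
	toolChoiceRequired cohereToolChoice = "REQUIRED"
)

// mapToolChoice forces a tool call only for "required"/"function",
// disables tools for "none", and leaves everything else (including
// "auto") to Cohere's default behavior by returning nil.
func mapToolChoice(choice string) *cohereToolChoice {
	switch choice {
	case "none":
		c := toolChoiceNone
		return &c
	case "required", "function":
		c := toolChoiceRequired
		return &c
	default: // "auto" and anything unrecognized
		return nil
	}
}

func main() {
	fmt.Println(mapToolChoice("auto") == nil)  // true: provider default
	fmt.Println(*mapToolChoice("required"))    // REQUIRED
}
```

Returning nil for unknown values is the conservative choice: a typo in the caller degrades to the provider's default rather than silently forcing a tool call.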

1193-1225: Encrypted reasoning content is exposed via a plain-text marker

In convertBifrostReasoningToCohereThinking, when EncryptedContent is present:

encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
thinkingBlock := CohereContentBlock{
    Type:     CohereContentBlockTypeThinking,
    Thinking: &encryptedText,
}

This wraps the encrypted payload in a clear‑text marker and sends it to Cohere as “thinking” text, which may be contrary to the intent of keeping it opaque and could leak internal details.

Safer options:

  • Skip EncryptedContent entirely for Cohere (don’t send it), or
  • Represent only high‑level metadata (e.g., “[ENCRYPTED_REASONING_PRESENT]”) without including the ciphertext.
🧹 Nitpick comments (4)
core/providers/utils/utils.go (1)

951-960: The duplicate-string concern is overstated; refactoring is optional, not essential.

While the implementation has minor inefficiencies, the actual impact is negligible for this use case:

  1. Duplicates are extremely unlikely: A 50-character random string from the 36-character alphabet used here offers roughly 10^78 possible combinations. The probability of duplicate outputs is vanishingly small, especially for API response message IDs generated during processing.

  2. Inefficiency is minor: Creating a new rand.Source per call has overhead, but these calls occur during response transformation—not in a tight loop. This is unlikely to be a performance bottleneck.

  3. Cryptographic security not required: These are internal message IDs, not authentication tokens or security-sensitive values.

If performance profiling shows this is a bottleneck, consider refactoring with sync.Once to initialize a package-level random source. However, this is not essential for the current usage pattern.

transports/bifrost-http/handlers/inference.go (1)

224-254: ResponsesRequest.UnmarshalJSON logic looks solid; fix comment wording and consider deduping with ChatRequest

This implementation correctly mirrors the ChatRequest flow: it protects the embedded BifrostParams from being shadowed by the custom ResponsesParameters.UnmarshalJSON, and it cleanly decodes the input union and params. No functional issues stand out.

Two small nits:

  • Line 236: the comment says "Unmarshal messages" but this block unmarshals the input field. Consider updating to avoid confusion.
  • The structure is now nearly identical to ChatRequest.UnmarshalJSON; if this pattern spreads further, a shared helper for "unmarshal BifrostParams + specific input + specific params" could reduce duplication, though it's not urgent.
core/providers/openai/responses.go (1)

42-84: Reasoning message skip / comment mismatch – please confirm intended behavior

  • For non‑gpt-oss models, reasoning messages with ResponsesReasoning but only ContentBlocks (no Summary, no EncryptedContent) are silently skipped (Lines 47–54). That drops those messages entirely instead of degrading them (e.g., into summaries or plain text). If such inputs can occur cross‑provider, this may be surprising; worth confirming that they can’t, or that dropping them is acceptable.
  • The comment “convert them to summaries” (Line 43) doesn’t match the code, which instead converts summaries to reasoning content blocks for gpt-oss when Content == nil (Lines 56–77). Updating the comment would avoid confusion.
core/providers/cohere/responses.go (1)

1304-1383: Reasoning summary content is only attached as blocks, not as Summary

convertSingleCohereMessageToBifrostMessages collects CohereContentBlockTypeThinking blocks into reasoningContentBlocks and then:

  • Prepends a ResponsesMessageTypeReasoning message with Content.ContentBlocks = reasoningContentBlocks and
  • Initializes ResponsesReasoning.Summary as an empty slice.

Given the new schema encourages using ResponsesReasoning.Summary for reasoning summaries, this is fine as long as downstream code expects reasoning_text content blocks and not populated Summary entries for Cohere outputs. If you intend to surface reasoning summaries uniformly across providers, you might later want to mirror those blocks into ResponsesReasoning.Summary as well.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a15c48b and d4bfce4.

📒 Files selected for processing (13)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (3 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (3 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/providers/anthropic/errors.go
  • ui/lib/types/logs.ts
  • core/providers/openai/openai.go
  • core/providers/openai/types.go
  • core/providers/anthropic/types.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/utils/utils.go
  • core/providers/openai/responses.go
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/gemini/responses.go
  • core/providers/cohere/responses.go
  • core/schemas/responses.go
  • framework/streaming/responses.go
🧬 Code graph analysis (6)
core/providers/openai/responses.go (2)
core/schemas/responses.go (5)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
transports/bifrost-http/handlers/inference.go (2)
core/schemas/bifrost.go (1)
  • ResponsesRequest (91-91)
core/schemas/responses.go (1)
  • ResponsesParameters (86-113)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (402-408)
  • ResponsesMessage (422-437)
  • ResponsesMessageContent (399-399)
  • ResponsesReasoning (416-419)
  • ResponsesReasoningSummary (411-414)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/cohere/responses.go (3)
core/providers/cohere/types.go (22)
  • CohereContentBlock (142-156)
  • CohereContentBlockTypeText (134-134)
  • CohereContentBlockTypeImage (135-135)
  • CohereContentBlockTypeThinking (136-136)
  • CohereStreamEvent (387-392)
  • StreamEventMessageStart (372-372)
  • StreamEventContentStart (373-373)
  • StreamEventContentDelta (374-374)
  • StreamEventContentEnd (375-375)
  • StreamEventToolPlanDelta (376-376)
  • StreamEventToolCallStart (377-377)
  • StreamEventToolCallDelta (378-378)
  • StreamEventToolCallEnd (379-379)
  • StreamEventCitationStart (380-380)
  • StreamEventCitationEnd (381-381)
  • StreamEventMessageEnd (382-382)
  • StreamEventDebug (383-383)
  • CohereChatRequest (14-31)
  • CohereMessage (50-56)
  • NewBlocksContent (105-109)
  • NewStringContent (98-102)
  • CohereImageURL (159-161)
core/schemas/responses.go (14)
  • BifrostResponsesResponse (45-84)
  • ResponsesStreamResponseTypeCreated (1362-1362)
  • ResponsesStreamResponseTypeInProgress (1363-1363)
  • ResponsesStreamResponseTypeOutputTextDone (1375-1375)
  • ResponsesStreamResponseTypeContentPartDone (1372-1372)
  • ResponsesMessage (313-326)
  • ResponsesStreamResponseTypeOutputItemDone (1369-1369)
  • ResponsesStreamResponseTypeOutputItemAdded (1368-1368)
  • ResponsesStreamResponseTypeContentPartAdded (1371-1371)
  • ResponsesStreamResponseTypeOutputTextDelta (1374-1374)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDelta (1380-1380)
  • ResponsesStreamResponseTypeFunctionCallArgumentsDone (1381-1381)
  • ResponsesResponseUsage (253-260)
  • ResponsesToolChoice (958-961)
core/schemas/utils.go (2)
  • SafeExtractIntPointer (486-494)
  • SafeExtractFromMap (519-525)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (351-358)
  • ResponsesReasoningSummary (411-414)
framework/streaming/responses.go (4)
core/schemas/responses.go (9)
  • ResponsesMessage (313-326)
  • ResponsesMessageTypeReasoning (306-306)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesReasoningContentBlockTypeSummaryText (740-740)
core/providers/gemini/types.go (3)
  • Type (778-778)
  • Role (13-13)
  • Content (922-930)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (3)
core/providers/gemini/responses.go (1)

138-179: Function-call ↔ thought-signature round-trip looks consistent

The new FunctionCall conversion:

  • Emits a ResponsesToolMessage with CallID, Name, and stringified Args.
  • Emits a separate reasoning message carrying EncryptedContent from ThoughtSignature.

And the reverse path:

  • Rebuilds FunctionCall from ResponsesToolMessage and, if the next message is a reasoning message with EncryptedContent, attaches it as ThoughtSignature.

This is internally consistent and nil‑safe; just keep in mind the assumption that the reasoning message immediately follows the function-call message when constructing messages elsewhere.

Also applies to: 609-655

transports/bifrost-http/integrations/anthropic.go (1)

73-103: Clarify SSE contract for multi-event Anthropic streaming

When ToAnthropicResponsesStreamResponse returns more than one event, the converter now:

  • Marshals each event to JSON and concatenates them as a single SSE string ("event: %s\ndata: %s\n\n"), and
  • Returns ("", combinedContent, nil).

This assumes the upstream streaming writer treats a non-empty payload with an empty event name as “already formatted SSE” and writes it verbatim. If the writer instead always wraps (eventName, data) into its own SSE envelope, this will double‑wrap or drop the event type.

Please double‑check the StreamConfig writer path to ensure:

  • event == "" is indeed interpreted as “raw SSE payload”, and
  • It’s acceptable to skip individual Anthropic events that fail sonic.Marshal rather than failing the whole chunk.
core/schemas/responses.go (1)

45-84: Schema extensions for stop reason, reasoning summaries, and signatures look coherent

The additions:

  • StopReason on BifrostResponsesResponse,
  • Signature on ResponsesMessageContentBlock,
  • the new ResponsesReasoningSummary type and updated ResponsesReasoning.Summary,
  • and Delta/Signature on BifrostResponsesStreamResponse

are structurally consistent with how the rest of the file models union types and streaming events.

The main follow‑up risk is making sure all converters and helpers (deep copies, provider adapters, streaming accumulators) are updated to propagate Signature and the new Summary shape; some of that is already wired up, but a few helpers still need updates (see streaming/cohere comments).

Also applies to: 399-410, 729-747, 1439-1442

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from d4bfce4 to bf9c361 Compare December 4, 2025 17:58
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (6)
core/providers/utils/utils.go (1)

267-271: Avoid MarshalIndent in hot-path request body marshalling.

CheckContextAndGetRequestBody is on the request path for all providers; using sonic.MarshalIndent here increases allocations and bloats every request payload with whitespace. Unless a specific upstream API strictly requires pretty-printed JSON, it’s better to keep the wire format compact and, if needed, pretty-print only for logging.

Consider reverting to sonic.Marshal:

-		jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+		jsonBody, err := sonic.Marshal(convertedBody)

If pretty JSON is truly required for a given provider, please document that requirement and consider making indentation opt‑in rather than the default for all providers.

core/providers/cohere/responses.go (5)

140-172: Avoid returning zero-value content block for invalid image URL

When cohereBlock.ImageURL == nil, this returns schemas.ResponsesMessageContentBlock{} with a zero Type, which can confuse downstream consumers that expect a valid type or no block at all. A small sentinel or text fallback is safer.

 case CohereContentBlockTypeImage:
-	// For images, create a text block describing the image
-	if cohereBlock.ImageURL == nil {
-		// Skip invalid image blocks without ImageURL
-		return schemas.ResponsesMessageContentBlock{}
-	}
-	return schemas.ResponsesMessageContentBlock{
-		Type: schemas.ResponsesInputMessageContentBlockTypeImage,
-		ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{
-			ImageURL: &cohereBlock.ImageURL.URL,
-		},
-	}
+	if cohereBlock.ImageURL == nil || cohereBlock.ImageURL.URL == "" {
+		// Return a small text sentinel instead of a zero-value block
+		return schemas.ResponsesMessageContentBlock{
+			Type: schemas.ResponsesInputMessageContentBlockTypeText,
+			Text: schemas.Ptr("[Image block with missing URL]"),
+		}
+	}
+	return schemas.ResponsesMessageContentBlock{
+		Type: schemas.ResponsesInputMessageContentBlockTypeImage,
+		ResponsesInputMessageContentBlockImage: &schemas.ResponsesInputMessageContentBlockImage{
+			ImageURL: &cohereBlock.ImageURL.URL,
+		},
+	}

174-488: Fix nil-dereference when generating reasoning item IDs in streaming

In the StreamEventContentStart handler for CohereContentBlockTypeThinking, state.MessageID is dereferenced before the nil check:

// Generate stable ID for reasoning item
itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
if state.MessageID == nil {
    itemID = fmt.Sprintf("reasoning_%d", outputIndex)
}

If state.MessageID is nil (e.g., no message_start ID), this will panic.

A nil-safe branch avoids the panic:

-				// Generate stable ID for reasoning item
-				itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
-				if state.MessageID == nil {
-					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
-				}
+				// Generate stable ID for reasoning item
+				var itemID string
+				if state.MessageID != nil {
+					itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
+				} else {
+					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
+				}

The rest of the reasoning/text streaming (reasoning_summary_text.{delta,done}, content_part.{added,done}, output_item.{added,done}) looks coherent and matches the intended OpenAI-style lifecycle.


1127-1146: Tool choice "auto" should not be forced to ToolChoiceRequired

The current mapping forces both "auto" and unknown strings to ToolChoiceRequired, which changes semantics and can unintentionally force tool calls:

case "required", "auto", "function":
    choice := ToolChoiceRequired
    return &choice
default:
    choice := ToolChoiceRequired
    return &choice

Safer behavior is to only map explicit "required"/"function" and "none", letting "auto" (and unknown strings) fall back to Cohere’s default behavior:

	if toolChoiceString != nil {
		switch *toolChoiceString {
		case "none":
			choice := ToolChoiceNone
			return &choice
-		case "required", "auto", "function":
+		case "required", "function":
			choice := ToolChoiceRequired
-			return &choice
-		default:
-			choice := ToolChoiceRequired
-			return &choice
+			return &choice
+		case "auto":
+			// Let Cohere use its default "auto" behavior.
+			return nil
+		default:
+			// Unknown strings: fall back to provider defaults.
+			return nil
		}
	}

1193-1229: Do not expose encrypted reasoning content as plain text

convertBifrostReasoningToCohereThinking currently converts encrypted reasoning into a readable string:

} else if msg.ResponsesReasoning.EncryptedContent != nil {
    encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
    thinkingBlock := CohereContentBlock{
        Type:     CohereContentBlockTypeThinking,
        Thinking: &encryptedText,
    }
    thinkingBlocks = append(thinkingBlocks, thinkingBlock)
}

This leaks the encrypted payload in clear form, which defeats the purpose of keeping it opaque when forwarding to another provider.

Better to skip encrypted reasoning entirely for Cohere:

-	} else if msg.ResponsesReasoning.EncryptedContent != nil {
-		// Cohere doesn't have a direct equivalent to encrypted content,
-		// so we'll store it as a regular thinking block with a special marker
-		encryptedText := fmt.Sprintf("[ENCRYPTED_REASONING: %s]", *msg.ResponsesReasoning.EncryptedContent)
-		thinkingBlock := CohereContentBlock{
-			Type:     CohereContentBlockTypeThinking,
-			Thinking: &encryptedText,
-		}
-		thinkingBlocks = append(thinkingBlocks, thinkingBlock)
+	} else if msg.ResponsesReasoning.EncryptedContent != nil {
+		// Cohere doesn't support encrypted reasoning; skip forwarding it so it remains opaque.
 	}

The existing handling of ContentBlocks and Summary already covers non-encrypted reasoning.


1231-1265: Access CallID via embedded struct to avoid nil-pointer panic

convertBifrostFunctionCallToCohereMessage reads msg.CallID directly:

if msg.CallID != nil {
    toolCall.ID = msg.CallID
}

Because CallID is promoted from the embedded *ResponsesToolMessage, this will panic if msg.ResponsesToolMessage is nil.

Guard the embedded pointer explicitly:

-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
-	}
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
+	}

The rest of the function already checks msg.ResponsesToolMessage != nil for Arguments and Name.
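The panic mode flagged here is general Go behavior: a field promoted through a nil embedded pointer is shorthand for dereferencing that pointer. A minimal illustration, with types simplified as stand-ins for ResponsesMessage/ResponsesToolMessage:

```go
package main

import "fmt"

type ResponsesToolMessage struct {
	CallID *string
}

type ResponsesMessage struct {
	*ResponsesToolMessage // embedded pointer; its fields are promoted
}

// safeCallID guards the embedded pointer before touching the promoted field.
func safeCallID(msg ResponsesMessage) *string {
	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
		return msg.ResponsesToolMessage.CallID
	}
	return nil
}

func main() {
	msg := ResponsesMessage{} // embedded *ResponsesToolMessage is nil

	// Evaluating msg.CallID directly would compile but panic at runtime:
	// it is shorthand for msg.ResponsesToolMessage.CallID.
	fmt.Println(safeCallID(msg)) // <nil>, no panic
}
```

This is why `if msg.CallID != nil` alone is not a safe guard: the dereference happens while evaluating the condition itself.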

🧹 Nitpick comments (4)
core/providers/openai/responses.go (1)

57-59: Redundant condition check.

len(message.ResponsesReasoning.Summary) > 0 is checked twice on lines 57 and 59.

-			if len(message.ResponsesReasoning.Summary) > 0 &&
-				strings.Contains(bifrostReq.Model, "gpt-oss") &&
-				len(message.ResponsesReasoning.Summary) > 0 &&
+			if len(message.ResponsesReasoning.Summary) > 0 &&
+				strings.Contains(bifrostReq.Model, "gpt-oss") &&
 				message.Content == nil {
transports/bifrost-http/integrations/anthropic.go (2)

94-98: Use injected logger instead of standard log package.

Using log.Printf directly bypasses the structured logger passed to NewAnthropicRouter. This can cause inconsistent logging behavior and lose context in production environments.

Consider passing the logger to the converter function or using a closure to capture it. If that's not feasible, at minimum document why log is used here.
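One way to do this is to have the router construct the converter as a closure over the injected logger. A hedged sketch, with a minimal `Logger` interface standing in for `schemas.Logger`:

```go
package main

import "fmt"

// Logger is a minimal stand-in for schemas.Logger.
type Logger interface {
	Warn(msg string)
}

type stdLogger struct{}

func (stdLogger) Warn(msg string) { fmt.Println("WARN:", msg) }

// newStreamConverter returns a converter closure that captures the
// injected logger instead of reaching for the global log package.
func newStreamConverter(logger Logger) func(event string) string {
	return func(event string) string {
		if event == "" {
			logger.Warn("skipping event that failed to marshal")
			return ""
		}
		return "event: " + event
	}
}

func main() {
	convert := newStreamConverter(stdLogger{})
	fmt.Println(convert("message_start"))
}
```

This keeps the logging behavior consistent with the rest of the router without changing the converter's call sites beyond construction.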


74-78: Remove or clarify commented-out code blocks.

Multiple commented-out code blocks are present. If this code is no longer needed, remove it to reduce confusion. If it's temporarily disabled, add a TODO comment explaining when it should be re-enabled.

Also applies to: 103-117

core/providers/cohere/responses.go (1)

1303-1429: Consider setting assistant role on synthesized reasoning messages

convertSingleCohereMessageToBifrostMessages builds a separate reasoning ResponsesMessage with populated ResponsesReasoning and ContentBlocks, but it doesn’t set a Role. For consistency with other providers and with how reasoning is emitted elsewhere, it’s useful to mark these as assistant-originated:

	if len(reasoningContentBlocks) > 0 {
+		role := schemas.ResponsesInputMessageRoleAssistant
 		reasoningMessage := schemas.ResponsesMessage{
 			ID:   schemas.Ptr("rs_" + fmt.Sprintf("%d", time.Now().UnixNano())),
 			Type: schemas.Ptr(schemas.ResponsesMessageTypeReasoning),
+			Role: &role,
 			ResponsesReasoning: &schemas.ResponsesReasoning{
 				Summary: []schemas.ResponsesReasoningSummary{},
 			},
 			Content: &schemas.ResponsesMessageContent{
 				ContentBlocks: reasoningContentBlocks,
 			},
 		}

This is a behavioral refinement rather than a correctness fix, but it will likely make downstream consumers’ role-based handling more predictable.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d4bfce4 and bf9c361.

📒 Files selected for processing (14)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/openai.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (3 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (3 hunks)
  • ui/lib/types/logs.ts (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • core/providers/openai/openai.go
  • ui/lib/types/logs.ts
  • core/providers/anthropic/errors.go
  • transports/bifrost-http/handlers/inference.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/openai/types.go
  • core/providers/utils/utils.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/gemini/responses.go
  • core/providers/anthropic/chat.go
  • core/providers/openai/responses.go
  • framework/streaming/responses.go
  • core/providers/anthropic/types.go
  • core/schemas/responses.go
  • core/providers/cohere/responses.go
🧬 Code graph analysis (5)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (512-519)
transports/bifrost-http/integrations/anthropic.go (1)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (402-408)
  • ResponsesMessage (422-437)
  • ResponsesMessageContent (399-399)
  • ResponsesReasoning (416-419)
  • ResponsesReasoningSummary (411-414)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/openai/responses.go (2)
core/schemas/responses.go (4)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (351-358)
  • ResponsesReasoningSummary (411-414)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (23)
core/schemas/responses.go (4)

68-68: LGTM: StopReason field addition.

The comment clearly documents that this field is "Not in OpenAI's spec, but sent by other providers", which provides useful context for maintainers.


399-402: LGTM: Signature field addition to ResponsesMessageContentBlock.

The Signature field for reasoning content blocks aligns with the streaming updates in BifrostResponsesStreamResponse and the UI type definitions.


730-747: LGTM: ResponsesReasoning and ResponsesReasoningSummary refactoring.

The transition from []ResponsesReasoningContent to []ResponsesReasoningSummary with explicit Type and Text fields provides clearer semantics. This aligns with the UI type ResponsesReasoningSummary in ui/lib/types/logs.ts:410-413.


1439-1441: LGTM: Streaming response signature support.

Adding Signature to BifrostResponsesStreamResponse enables proper signature propagation during streaming, which is essential for reasoning content integrity.

core/providers/anthropic/types.go (2)

135-145: LGTM: RedactedThinking content block type.

Adding AnthropicContentBlockTypeRedactedThinking enables proper handling of redacted thinking content blocks from Anthropic's API.


153-153: LGTM: Data field for redacted thinking.

The Data field with clear documentation for encrypted data in redacted thinking blocks is appropriate.

core/providers/openai/types.go (1)

154-192: LGTM: Custom marshaling to strip MaxTokens for OpenAI.

The implementation correctly strips MaxTokens from reasoning parameters before sending to OpenAI, since OpenAI doesn't support this field (it's Anthropic-specific per the schema documentation). The approach using an alias struct and json.RawMessage for preserving custom Input marshaling is sound.

core/providers/openai/responses.go (1)

42-85: LGTM: Reasoning content handling for OpenAI models.

The logic correctly differentiates between:

  1. gpt-oss models: which use reasoning_text content blocks
  2. Other OpenAI models: which use summaries + encrypted_content

The transformation ensures proper format compatibility when sending requests to OpenAI.

transports/bifrost-http/integrations/anthropic.go (1)

86-119: LGTM: Multi-event SSE aggregation logic.

The streaming response handling correctly aggregates multiple Anthropic events into proper SSE format and handles edge cases (empty responses, single events). The error logging without failing allows the stream to continue processing remaining events.

framework/streaming/responses.go (3)

497-534: LGTM: ReasoningSummaryTextDelta handling.

The implementation correctly:

  1. Searches for existing reasoning messages by ItemID (reverse iteration for efficiency)
  2. Creates new reasoning messages when needed with proper initialization
  3. Handles both text deltas and signature deltas in a single pass

The guard condition on line 500 ensures we have at least one payload and a valid ItemID before processing.


626-679: LGTM: Reasoning delta accumulation with dual-path logic.

The helper correctly handles two accumulation paths:

  1. With ContentIndex: Accumulates into content blocks as reasoning_text type
  2. Without ContentIndex: Accumulates into ResponsesReasoning.Summary

The TODO comment on lines 667-668 appropriately notes future enhancement potential for multiple summary entries.


681-727: LGTM: Signature accumulation with proper field mapping.

The signature helper correctly maps:

  • With ContentIndex → Signature field in content block
  • Without ContentIndex → EncryptedContent field in ResponsesReasoning

This aligns with the schema design where EncryptedContent serves as the signature/encrypted data at the reasoning level.

core/providers/anthropic/chat.go (1)

608-634: PartialJSON guard condition now emits empty string deltas.

The condition changed from chunk.Delta.PartialJSON != nil && *chunk.Delta.PartialJSON != "" to just chunk.Delta.PartialJSON != nil. This allows empty string partial JSON to be emitted as deltas. Evidence shows this is intentional: responses.go:3069 explicitly creates empty PartialJSON deltas, and the accumulation logic (responses.go:470, 478) safely concatenates even empty strings. Validation of non-empty Arguments is deferred to after accumulation completes (as seen in test utilities validating the final assembled result). This change is safe and maintains streaming consistency.

core/providers/gemini/responses.go (2)

138-179: Function-call → tool message + reasoning signature path looks solid

The new FunctionCall branch builds a proper ResponsesToolMessage (with JSON-serialized args) and a separate reasoning message carrying Summary (initialized empty) and EncryptedContent for the thought signature. This cleanly aligns Gemini function calls with the updated ResponsesReasoning schema and avoids range-variable capture issues.


596-629: Reconstruction of Gemini FunctionCall + ThoughtSignature is consistent with emit side

The FunctionCall reconstruction from ResponsesToolMessage (including CallID and decoded Arguments) and the lookahead-based ThoughtSignature attachment match how convertGeminiCandidatesToResponsesOutput emits the function-call + reasoning pair. As long as the reasoning message immediately follows the function-call (which this file enforces), this round-trip is coherent.

core/providers/cohere/responses.go (8)

13-25: ReasoningContentIndices tracking and reset look correct

Adding ReasoningContentIndices into CohereResponsesStreamState, initializing it in the pool, and clearing it in both acquireCohereResponsesStreamState and flush ensures per-stream tracking of reasoning content indices without leaking state between streams. No issues here.

Also applies to: 29-41, 45-77, 89-118


490-567: Streaming tool plan, tool calls, citations, and lifecycle wiring look consistent

The handling of StreamEventToolPlanDelta, tool call start/delta/end, citation start/end, and StreamEventMessageEnd appears internally consistent:

  • Tool plan text is emitted as normal output_text.delta on a dedicated output index, with proper close-out events before tool calls.
  • Tool call arguments are buffered per-output-index and finalized with function_call_arguments.done followed by output_item.done.
  • Citations become OutputTextAnnotationAdded/OutputTextAnnotationDone with indices wired via ContentIndexToOutputIndex.
  • Message end emits a single response.completed with aggregated usage and stable CreatedAt.

No additional correctness issues stand out beyond the reasoning-ID nil-deref already called out.

Also applies to: 612-735, 735-803, 804-848


850-932: Bifrost → Cohere request conversion is aligned with Responses params

ToCohereResponsesRequest cleanly maps core parameters (MaxOutputTokens, Temperature, TopP, top_k, stop sequences, penalties) and the thinking extra param into the Cohere request, and converts tools/tool choice/messages via the new helpers. The shape looks correct and side-effect free.


935-975: Cohere → Bifrost response conversion is straightforward

ToBifrostResponsesResponse correctly maps the Cohere ID, computes CreatedAt at receipt time, translates usage (including cached tokens), and uses ConvertCohereMessagesToBifrostMessages(..., true) for the output message. This path looks correct; the only caveat is that Model isn't propagated here, though that's consistent with the current implementation.


977-1112: Bidirectional message mapping handles system, reasoning, and tools coherently

ConvertBifrostMessagesToCohereMessages and ConvertCohereMessagesToBifrostMessages:

  • Separate system content and prepend it as a single system message to Cohere.
  • Accumulate reasoning messages into CohereContentBlockTypeThinking and attach them to the next assistant message.
  • Convert function calls and function outputs into Cohere’s tool_calls / role:"tool" structures and back.

The control flow around currentAssistantMessage and pendingReasoningContentBlocks looks sound and flushes consistently at boundaries.


1150-1191: System and regular message conversions are straightforward

convertBifrostMessageToCohereSystemContent and convertBifrostMessageToCohereMessage simply flatten text blocks and map roles, and only emit content when present. Both look correct and side-effect free.


1267-1301: Function call output → Cohere role:"tool" mapping looks correct

convertBifrostFunctionCallOutputToCohereMessage only emits a tool message when CallID is present, reconstructs content from either Content or the OpenAI-style Output wrapper, and sets ToolCallID appropriately. This matches the Responses schema and Cohere expectations.


1431-1465: Content block → Cohere block conversion covers key cases

convertResponsesMessageContentBlocksToCohere correctly:

  • Maps both input and output text types to CohereContentBlockTypeText.
  • Converts image URL blocks into CohereContentBlockTypeImage.
  • Maps reasoning blocks to CohereContentBlockTypeThinking.

This provides a clean, minimal surface for Cohere without surprising behavior.

@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from 12-04-feat_raw_response_accumulation_for_streaming to graphite-base/1000 December 5, 2025 14:01
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from bf9c361 to bcef5b4 Compare December 5, 2025 14:01
@Pratham-Mishra04 Pratham-Mishra04 changed the base branch from graphite-base/1000 to 12-05-feat_send_back_raw_request_support December 5, 2025 14:02
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

♻️ Duplicate comments (4)
core/providers/anthropic/types.go (1)

356-356: This StopSequence change was already flagged in a previous review.

The change from *string with omitempty to string without omitempty breaks API compatibility, as previously noted. Empty strings will serialize as "stop_sequence": "" instead of being omitted or representing null.

core/providers/cohere/responses.go (3)

1238-1256: Guard access to promoted CallID field to prevent panics

On line 1244, msg.CallID is accessed directly, but CallID is promoted from the embedded *ResponsesToolMessage. If msg.ResponsesToolMessage is nil, accessing msg.CallID will panic even inside the if condition.

Apply this fix:

-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
 	}

The same pattern should be applied to any other accesses of promoted fields from msg.ResponsesToolMessage throughout the function.


319-327: Critical: Nil-pointer dereference on state.MessageID

The code dereferences *state.MessageID on line 324 before checking if it's nil on line 325, which will cause a panic.

Apply this fix:

-			// Generate stable ID for reasoning item
-			itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
-			if state.MessageID == nil {
-				itemID = fmt.Sprintf("reasoning_%d", outputIndex)
-			}
+			// Generate stable ID for reasoning item
+			var itemID string
+			if state.MessageID != nil {
+				itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
+			} else {
+				itemID = fmt.Sprintf("reasoning_%d", outputIndex)
+			}

Note: This issue was flagged in a previous review but appears to remain unaddressed in the current code.


1131-1142: "auto" tool choice should map to ToolChoiceAuto, not ToolChoiceRequired

The "auto" tool choice is incorrectly mapped to ToolChoiceRequired. Cohere's API supports three tool choice modes: NONE, REQUIRED, and AUTO (defined as constants in core/providers/cohere/types.go). Mapping "auto" to ToolChoiceRequired changes semantics—auto allows the model to decide whether to call a tool, while required forces a tool call.

-		case "required", "auto", "function":
-			choice := ToolChoiceRequired
-			return &choice
+		case "required", "function":
+			choice := ToolChoiceRequired
+			return &choice
+		case "auto":
+			choice := ToolChoiceAuto
+			return &choice
🧹 Nitpick comments (4)
core/providers/vertex/errors.go (1)

14-28: Centralized body decoding and error classification look correct

Using providerUtils.CheckAndDecodeBody and switching all sonic.Unmarshal calls to decodedBody is a solid improvement: it handles content‑encoding consistently and cleanly separates decode failures (ErrProviderResponseDecode) from JSON shape issues (ErrProviderResponseUnmarshal). The fallback chain for OpenAI/Vertex/VertexValidation error formats remains intact and behaviorally equivalent apart from the improved error typing. I don’t see new correctness or panic risks here; this aligns well with the shared decoding utils used in other providers.

ui/app/workspace/logs/views/logResponsesMessageView.tsx (1)

202-204: Use strict equality (===) for type comparison.

The guard logic is correct, but TypeScript/JavaScript best practice is to use strict equality === instead of loose equality == for type comparisons.

-	if (message.type == "reasoning" && (!message.summary || message.summary.length === 0) && !message.encrypted_content) {
+	if (message.type === "reasoning" && (!message.summary || message.summary.length === 0) && !message.encrypted_content) {
 		return null;
 	}
core/schemas/responses.go (1)

731-733: Consider using a pointer or omitempty behavior for Summary slice.

The Summary field is a non-pointer slice without omitempty. In Go, an empty slice []ResponsesReasoningSummary{} will serialize as "summary": [] rather than being omitted. If the intent is to omit the field when empty (consistent with the UI guard checking message.summary.length === 0), consider adding omitempty.

 type ResponsesReasoning struct {
-	Summary          []ResponsesReasoningSummary `json:"summary"`
+	Summary          []ResponsesReasoningSummary `json:"summary,omitempty"`
 	EncryptedContent *string                     `json:"encrypted_content,omitempty"`
 }
transports/bifrost-http/integrations/anthropic.go (1)

111-113: Use the structured logger instead of log.Printf.

The router receives a schemas.Logger parameter (as seen in NewAnthropicRouter), but this error logging uses the standard library's log.Printf. For consistency with the codebase's logging practices, use the structured logger.

Consider passing the logger to the stream converter or using a context-aware logging approach. If the logger isn't accessible in this closure, the error could be returned or the design adjusted to provide logger access.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bf9c361 and bcef5b4.

📒 Files selected for processing (18)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (4 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
💤 Files with no reviewable changes (1)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
🚧 Files skipped from review as they are similar to previous changes (5)
  • transports/bifrost-http/handlers/inference.go
  • core/providers/anthropic/chat.go
  • core/providers/anthropic/errors.go
  • core/providers/utils/utils.go
  • ui/lib/types/logs.ts
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/schemas/bifrost.go
  • core/providers/gemini/responses.go
  • core/providers/vertex/errors.go
  • core/schemas/responses.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/router.go
  • core/providers/openai/types.go
  • core/providers/openai/responses.go
  • core/providers/anthropic/types.go
  • core/providers/cohere/responses.go
  • framework/streaming/responses.go
🧬 Code graph analysis (7)
core/providers/gemini/responses.go (3)
ui/lib/types/logs.ts (6)
  • FunctionCall (157-160)
  • ResponsesToolMessage (403-409)
  • ResponsesMessage (423-438)
  • ResponsesMessageContent (400-400)
  • ResponsesReasoning (417-420)
  • ResponsesReasoningSummary (412-415)
core/providers/gemini/types.go (4)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
  • CheckAndDecodeBody (467-475)
  • NewBifrostOperationError (493-504)
core/schemas/provider.go (1)
  • ErrProviderResponseDecode (29-29)
core/providers/vertex/vertex.go (1)
  • VertexError (25-31)
core/providers/vertex/types.go (1)
  • VertexValidationError (154-161)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (6)
  • Anthropic (37-37)
  • Vertex (40-40)
  • BifrostContextKeyUseRawRequestBody (117-117)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
  • BifrostContextKeySkipKeySelection (114-114)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamResponse (834-1232)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/gemini/types.go (1)
  • Type (778-778)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (513-520)
core/providers/openai/responses.go (2)
core/schemas/responses.go (4)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesMessageContent (338-343)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (143-152)
  • OpenAIResponsesRequestInput (110-113)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (21)
core/schemas/responses.go (4)

743-747: Type rename from ResponsesReasoningContent to ResponsesReasoningSummary looks correct.

The rename aligns with the UI type definition and the semantic meaning of the struct. This is a breaking change for any external consumers using the old type name.


68-68: LGTM - StopReason field addition.

The StopReason field appropriately accommodates providers that include stop reason information outside OpenAI's spec.


399-402: LGTM - Signature field addition to content blocks.

The Signature field enables per-content signatures for reasoning blocks, consistent with the streaming response structure.


1439-1441: LGTM - Streaming response signature support.

The Signature field addition aligns with the content block changes and properly documents that it's "Not in OpenAI's spec, but sent by other providers."

core/providers/anthropic/types.go (4)

135-145: LGTM - Added redacted_thinking content block type.

The new constant appropriately extends the enum to support Anthropic's redacted/encrypted thinking content blocks.


153-153: LGTM - Data field for redacted thinking.

The Data field properly supports encrypted data for redacted thinking blocks, with the comment clarifying that signature does not accompany this field.


350-350: LGTM - Type field made optional.

Adding omitempty to the Type field in AnthropicStreamDelta is appropriate as the type may not always be present in delta events.


300-305: The prior concern about dropping omitempty on the cache token fields is invalid, but not for the reason usually cited: in Go's encoding/json, omitempty does omit zero-valued integers, so this change means zero counts now serialize explicitly as 0 instead of being dropped.

That is the intended behavior. The actual change (using a value type instead of a pointer for CacheCreation) aligns with Anthropic's API contract, which always includes the cache_creation object in usage responses, even when empty. The code correctly deserializes and handles these fields per the documented Anthropic API behavior.

Likely an incorrect or invalid review comment.

transports/bifrost-http/integrations/anthropic.go (3)

74-81: LGTM - Extended provider check for Vertex with Anthropic models.

The condition correctly handles the case where Vertex is the provider but the model is an Anthropic/Claude model, using the IsAnthropicModel utility function.


106-122: LGTM - SSE aggregation for multiple streaming events.

The logic correctly handles the case where ToAnthropicResponsesStreamResponse returns multiple events by combining them into a properly formatted SSE string with event: and data: lines.


194-206: LGTM - Refined passthrough gating.

The updated logic properly:

  1. Only sets raw request body for Anthropic or unspecified providers
  2. Conditionally attaches extra headers/URL path only when not using Anthropic API key auth
core/providers/openai/types.go (1)

154-192: Well-structured custom marshaling implementation.

The approach correctly shadows the embedded fields to customize JSON output. The implementation properly:

  1. Marshals Input first using its custom MarshalJSON method
  2. Wraps it in json.RawMessage to preserve the marshaled output
  3. Copies Reasoning with MaxTokens set to nil

This is correct for the OpenAI Responses API, which does not include a max_tokens field in the reasoning parameter. Token limiting is controlled at the request level via max_output_tokens, not within the reasoning configuration. The implementation correctly omits this field by setting it to nil.

core/schemas/bifrost.go (1)

120-120: LGTM!

The addition of BifrostContextKeyIntegrationType follows the existing pattern for context keys and is used appropriately in the router to store integration type information.

transports/bifrost-http/integrations/router.go (3)

312-313: LGTM!

Setting the integration type in the context is clean and follows the established pattern for storing request metadata.


709-712: LGTM!

The updated shouldSendDoneMarker logic correctly distinguishes between providers that expect [DONE] markers and those that don't (Anthropic and the responses API).


883-883: LGTM!

Expanding the SSE string check to allow both "data: " and "event: " prefixes properly supports providers like Anthropic that use custom event types in their SSE format.

framework/streaming/responses.go (3)

498-534: LGTM!

The new ReasoningSummaryTextDelta handling correctly creates or finds reasoning messages and delegates to the new helper functions for accumulation. The logic to find existing messages by ItemID is sound.


626-679: LGTM!

The appendReasoningDeltaToResponsesMessage helper correctly handles both content-block-based reasoning (with ContentIndex) and summary-based reasoning (without ContentIndex). The array bounds checks and initialization logic are appropriate.


681-727: LGTM!

The appendReasoningSignatureToResponsesMessage helper mirrors the delta logic and correctly handles signatures in both content blocks and encrypted content. The implementation is consistent with the delta handler.

core/providers/gemini/responses.go (2)

138-179: LGTM!

The function call handling improvements include:

  1. Proper JSON marshaling of function arguments
  2. Creating local copies to avoid range loop variable capture issues
  3. Correctly initializing the new Summary field when emitting reasoning messages for ThoughtSignature

These changes align with the broader schema updates for reasoning summaries.


609-629: LGTM!

The conversion logic correctly:

  1. Sets the function call name and arguments
  2. Propagates the CallID when present
  3. Preserves ThoughtSignature by looking ahead for reasoning messages with encrypted content

This properly handles Gemini 3 Pro's requirement for ThoughtSignature on function calls.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from bcef5b4 to 4f289b9 Compare December 5, 2025 14:25
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch 2 times, most recently from 4ab2a0a to 10060d1 Compare December 5, 2025 14:29
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from 4f289b9 to b3244b9 Compare December 5, 2025 14:29
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from 10060d1 to d6466cb Compare December 6, 2025 10:05
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from b3244b9 to e04023a Compare December 6, 2025 10:05
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (9)
core/providers/utils/utils.go (2)

267-267: Past concern remains unaddressed: MarshalIndent increases payload size.

A previous review comment already identified that switching to sonic.MarshalIndent increases the payload size for all provider API requests without documented justification. The code still uses the indented format, and the concern about debug prints and production performance remains valid.


1000-1009: Past suggestions remain unaddressed: Input validation and efficiency improvements.

A previous review comment already provided detailed suggestions for improving this function, including:

  • Adding length validation to prevent panics for length <= 0
  • Using a const string for letters instead of recreating []rune each call
  • Building the result into a []byte buffer instead of []rune

The function works correctly for its current use case (generating cosmetic identifiers), but these improvements would harden it for broader use.
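The suggested hardening can be sketched as follows; this is an illustrative rewrite, not the project's actual implementation:

```go
package main

import (
	"fmt"
	"math/rand"
)

// A package-level const avoids recreating the alphabet on every call.
const letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

// getRandomString builds cosmetic identifiers (not security tokens).
// It guards against non-positive lengths and writes into a []byte
// buffer instead of a []rune slice.
func getRandomString(length int) string {
	if length <= 0 {
		return ""
	}
	b := make([]byte, length)
	for i := range b {
		b[i] = letters[rand.Intn(len(letters))]
	}
	return string(b)
}

func main() {
	fmt.Println(len(getRandomString(8)))   // 8
	fmt.Println(getRandomString(-1) == "") // true
}
```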

core/providers/anthropic/types.go (1)

356-356: StopSequence should use *string with omitempty for API compatibility.

This concern was raised in a previous review. Changing StopSequence from *string with omitempty to string without omitempty breaks compatibility with Anthropic's API specification. The API returns stop_sequence as either null (in initial streaming events) or a string value. Using a non-pointer string will serialize empty strings as "stop_sequence": "" instead of properly representing the null state.

Apply this diff to restore API compatibility:

-	StopSequence string                   `json:"stop_sequence"`
+	StopSequence *string                  `json:"stop_sequence,omitempty"`
transports/bifrost-http/integrations/anthropic.go (1)

94-105: Remove commented-out code.

This dead code was flagged in a previous review. Remove it to improve maintainability.

 				} else {
-					// if resp.ExtraFields.Provider == schemas.Anthropic ||
-					// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-					// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-					// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-					// 	if resp.ExtraFields.RawResponse != nil {
-					// 		var rawResponseJSON anthropic.AnthropicStreamDelta
-					// 		err := sonic.Unmarshal([]byte(resp.ExtraFields.RawResponse.(string)), &rawResponseJSON)
-					// 		if err == nil {
-					// 			return string(rawResponseJSON.Type), resp.ExtraFields.RawResponse, nil
-					// 		}
-					// 	}
-					// }
 					if len(anthropicResponse) > 1 {
core/providers/cohere/responses.go (5)

140-172: Empty block returned for invalid image may cause downstream issues.

When ImageURL is nil (line 150-152), an empty ResponsesMessageContentBlock{} with zero-value Type is returned. This was flagged in a previous review but the current fix returns an empty block instead of a sentinel value.

Consider returning a properly typed block or filtering at the call site.


319-327: Nil-pointer dereference risk in reasoning item ID generation.

Line 324 dereferences *state.MessageID before the nil check on line 325. This was flagged in a previous review and remains unaddressed.

-				itemID := fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
-				if state.MessageID == nil {
-					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
-				}
+				var itemID string
+				if state.MessageID != nil {
+					itemID = fmt.Sprintf("msg_%s_reasoning_%d", *state.MessageID, outputIndex)
+				} else {
+					itemID = fmt.Sprintf("reasoning_%d", outputIndex)
+				}

1131-1142: Tool choice "auto" incorrectly maps to "required".

This was flagged in a previous review. The "auto" tool choice has different semantics than "required" - auto lets the model decide, while required forces a tool call.

Verify Cohere's tool choice options and map "auto" appropriately (possibly to nil for default behavior).


1216-1225: Encrypted reasoning content exposed in plain text marker.

This was flagged in a previous review. Embedding encrypted content in a [ENCRYPTED_REASONING: ...] marker exposes potentially sensitive data in plain text to Cohere.

Consider skipping encrypted content entirely rather than exposing it.


1244-1246: Guard access to embedded CallID to avoid nil panic.

Accessing msg.CallID when msg.ResponsesToolMessage is nil will panic because CallID is a field on the embedded pointer type.

-	if msg.CallID != nil {
-		toolCall.ID = msg.CallID
-	}
+	if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
+		toolCall.ID = msg.ResponsesToolMessage.CallID
+	}
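The promoted-field panic is a classic Go pitfall worth seeing in isolation. The stripped-down types below mirror the embedded-pointer shape (illustrative names, not the project's full structs):

```go
package main

import "fmt"

type ResponsesToolMessage struct {
	CallID *string
}

type ResponsesMessage struct {
	*ResponsesToolMessage // embedded pointer; CallID is promoted through it
}

func main() {
	msg := ResponsesMessage{} // embedded pointer is nil

	// Guarded access is safe: check the embedded pointer first.
	if msg.ResponsesToolMessage != nil && msg.CallID != nil {
		fmt.Println(*msg.CallID)
	}

	// Unguarded promotion panics: msg.CallID is shorthand for
	// msg.ResponsesToolMessage.CallID, a nil-pointer dereference.
	defer func() {
		if r := recover(); r != nil {
			fmt.Println("recovered:", r) // prints a recovered runtime error
		}
	}()
	_ = msg.CallID // panics here
}
```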
🧹 Nitpick comments (2)
transports/bifrost-http/integrations/router.go (1)

709-712: Consider tightening the /responses path check.

The strings.Contains(config.Path, "/responses") check is somewhat broad and could match unintended paths (e.g., a hypothetical /api/responses_v2). Consider using a more specific check:

-		if config.Type == RouteConfigTypeAnthropic || strings.Contains(config.Path, "/responses") {
+		if config.Type == RouteConfigTypeAnthropic || strings.HasSuffix(config.Path, "/responses") || strings.Contains(config.Path, "/responses/") {
			shouldSendDoneMarker = false
		}

Alternatively, you could add a dedicated flag to StreamConfig to explicitly control DONE marker behavior.

core/providers/gemini/responses.go (1)

143-145: Inconsistent JSON library usage.

The code uses json.Marshal here while the rest of the file uses sonic for JSON operations. This inconsistency could lead to subtle serialization differences.

-				if argsBytes, err := json.Marshal(part.FunctionCall.Args); err == nil {
+				if argsBytes, err := sonic.Marshal(part.FunctionCall.Args); err == nil {
 					argumentsStr = string(argsBytes)
 				}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bcef5b4 and e04023a.

📒 Files selected for processing (18)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/types.go (2 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (2 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (4 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
💤 Files with no reviewable changes (1)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
🚧 Files skipped from review as they are similar to previous changes (6)
  • core/providers/vertex/errors.go
  • core/providers/openai/responses.go
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/providers/anthropic/errors.go
  • ui/lib/types/logs.ts
  • transports/bifrost-http/handlers/inference.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/schemas/bifrost.go
  • core/schemas/responses.go
  • core/providers/gemini/responses.go
  • core/providers/anthropic/chat.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/anthropic/types.go
  • core/providers/openai/types.go
  • transports/bifrost-http/integrations/router.go
  • framework/streaming/responses.go
  • core/providers/cohere/responses.go
  • core/providers/utils/utils.go
🧬 Code graph analysis (7)
core/schemas/responses.go (2)
core/providers/gemini/types.go (1)
  • Type (778-778)
ui/lib/types/logs.ts (2)
  • ResponsesMessageContentBlockType (352-359)
  • ResponsesReasoningSummary (412-415)
core/providers/gemini/responses.go (2)
core/providers/gemini/types.go (5)
  • FunctionCall (1091-1101)
  • Role (13-13)
  • Content (922-930)
  • Type (778-778)
  • Part (936-960)
core/schemas/responses.go (5)
  • ResponsesToolMessage (461-481)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
transports/bifrost-http/integrations/anthropic.go (3)
core/schemas/bifrost.go (5)
  • Anthropic (37-37)
  • Vertex (40-40)
  • BifrostContextKeyUseRawRequestBody (117-117)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamResponse (794-1192)
core/providers/openai/types.go (2)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (513-520)
transports/bifrost-http/integrations/router.go (2)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/gemini/types.go (1)
  • Type (778-778)
framework/streaming/responses.go (3)
core/schemas/responses.go (6)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
core/schemas/utils.go (1)
  • Ptr (16-18)
framework/streaming/accumulator.go (1)
  • Accumulator (14-30)
core/providers/utils/utils.go (1)
core/schemas/bifrost.go (2)
  • BifrostError (356-365)
  • ErrorField (374-381)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (26)
core/providers/utils/utils.go (2)

10-10: LGTM: Import supports the new utility function.

The math/rand import is required for the GetRandomString function added below. Since the function generates cosmetic identifiers (not security-sensitive tokens), this import is appropriate.


322-334: LGTM: Good defensive addition for compressed error responses.

Adding CheckAndDecodeBody before unmarshaling ensures that compressed (e.g., gzip) error responses from provider APIs are properly decoded. The error handling correctly returns a BifrostError with status code if decoding fails, and the decoded body is appropriately used for subsequent unmarshaling.

core/providers/anthropic/types.go (1)

135-145: LGTM - New content block type for redacted thinking.

The addition of AnthropicContentBlockTypeRedactedThinking aligns with Anthropic's extended thinking feature where thinking blocks may be redacted. The Data field addition on line 153 properly supports encrypted data for these redacted thinking blocks.

core/schemas/bifrost.go (1)

120-120: LGTM - New context key for integration type.

The new BifrostContextKeyIntegrationType constant enables routing logic to identify the integration type (OpenAI, Anthropic, etc.) in the request context. This supports the conditional DONE marker behavior in handleStreaming.

core/providers/openai/types.go (1)

154-192: LGTM - Custom marshaling excludes MaxTokens for OpenAI.

The MarshalJSON implementation correctly strips MaxTokens from the Reasoning field before serialization, as OpenAI's Responses API doesn't support this parameter (it's Anthropic-specific per the schema comment). The approach of manually copying fields ensures the original request remains unchanged.

Note: If ResponsesParametersReasoning gains new fields in the future, this method will need to be updated to copy them as well.

transports/bifrost-http/integrations/router.go (2)

312-313: LGTM - Integration type stored in context.

Setting the integration type in the context enables downstream logic to conditionally handle provider-specific behaviors like DONE marker emission.


883-885: LGTM - Extended SSE prefix handling.

The condition now correctly handles both "data: " and "event: " prefixed strings, allowing providers that return complete SSE-formatted strings to pass through without double-wrapping.

transports/bifrost-http/integrations/anthropic.go (3)

74-82: LGTM - Extended provider check for Vertex with Anthropic models.

The condition now correctly handles both direct Anthropic requests and Vertex requests using Anthropic models (claude-*), returning raw responses when available.


106-122: LGTM - Multi-event aggregation for streaming responses.

The logic correctly handles cases where ToAnthropicResponsesStreamResponse returns multiple events by aggregating them into a single SSE-formatted string with proper event: and data: lines. Single events are returned directly for more efficient handling.


194-206: Empty provider assumption and OAuth key skipping are correct.

The code's assumption that provider == "" means Anthropic passthrough is reasonable given this is the /anthropic/v1/messages endpoint. The BifrostContextKeySkipKeySelection flag is intentionally set for OAuth flows (detected by the Bearer sk-ant-oat* token in isAnthropicAPIKeyAuth), not API key auth. Anthropic is in the allowed list for key skipping (unlike Azure, Bedrock, and Vertex), so passing an empty key to the provider for OAuth flows is the intended behavior and is properly guarded.

core/providers/anthropic/chat.go (1)

608-634: No issues found with empty PartialJSON handling.

The change to emit tool input deltas whenever PartialJSON is non-nil is safe. Downstream code in framework/streaming/accumulator.go explicitly handles empty string Arguments through string concatenation (line 267), which safely accumulates empty strings without issues. The accumulator also includes special handling for edge cases like empty braces (line 247-248), confirming the code is prepared for empty Arguments values during streaming aggregation.

core/providers/gemini/responses.go (3)

148-164: LGTM - Good defensive copy pattern.

The code correctly creates local copies of functionCallID and functionCallName to avoid potential issues with range variable capture when these values are used in pointers.


166-179: ThoughtSignature preservation for Gemini 3 Pro looks correct.

The logic to emit a separate ResponsesReasoning message when ThoughtSignature is present ensures the signature can be round-tripped. The Summary field is correctly initialized as an empty slice.


619-627: The look-ahead logic is correct; the reasoning message is always emitted immediately after the function call.

In convertGeminiCandidatesToResponsesOutput, when a function call part with a ThoughtSignature is processed, the reasoning message is appended directly after the function call message within the same case block (lines 167–178). There is no opportunity for intervening messages between them, as the loop processes individual parts and appends complete function-call-plus-reasoning pairs sequentially to the messages array.

framework/streaming/responses.go (3)

497-534: LGTM - Well-structured reasoning delta handling.

The new ReasoningSummaryTextDelta case correctly:

  1. Guards against nil Delta/Signature with ItemID check
  2. Searches backwards for existing message by ID
  3. Creates new reasoning message if not found
  4. Handles both text delta and signature delta

626-679: Clear dual-path logic for reasoning delta accumulation.

The helper correctly branches on contentIndex:

  • With index: accumulates into content blocks (reasoning_text type)
  • Without index: accumulates into ResponsesReasoning.Summary

The comment on lines 667-668 acknowledges future extensibility for multiple summary entries.


681-727: Signature helper mirrors delta helper pattern.

The appendReasoningSignatureToResponsesMessage follows the same dual-path logic as the delta helper, storing signatures either in content blocks (Signature field) or in ResponsesReasoning.EncryptedContent. This is consistent with the schema design.

core/schemas/responses.go (4)

68-68: LGTM - StopReason field addition.

The StopReason field is properly documented as not part of OpenAI's spec but needed for other providers. The omitempty tag ensures it won't appear in responses when not set.


399-402: Signature field enables reasoning content signing.

The new Signature field on ResponsesMessageContentBlock supports the reasoning signature streaming feature added in the streaming layer. The field ordering and JSON tag are correct.


729-747: ResponsesReasoning schema update aligns with streaming changes.

The Summary field now uses []ResponsesReasoningSummary with the new struct definition. This aligns with:

  • The streaming helper that appends to Summary[0].Text
  • The UI type definition (ResponsesReasoningSummary with type: "summary_text")
  • The Gemini conversion that initializes Summary: []schemas.ResponsesReasoningSummary{}

1439-1441: Stream response Signature field added.

The Signature field on BifrostResponsesStreamResponse enables streaming reasoning signatures, used by the appendReasoningSignatureToResponsesMessage helper. The comment correctly notes this is not in OpenAI's spec.

core/providers/cohere/responses.go (5)

17-17: LGTM - ReasoningContentIndices state tracking.

The new ReasoningContentIndices map correctly tracks which content indices are reasoning blocks, enabling proper event emission (reasoning vs text) during streaming. Initialization and cleanup follow the established pattern for other state maps.

Also applies to: 34-34, 64-68, 106-110


850-932: ToCohereResponsesRequest implementation looks correct.

The conversion properly:

  • Maps basic parameters (MaxOutputTokens, Temperature, TopP)
  • Extracts extra params (top_k, stop, frequency_penalty, presence_penalty, thinking)
  • Converts tools and tool choice
  • Delegates message conversion to ConvertBifrostMessagesToCohereMessages

977-1112: ConvertBifrostMessagesToCohereMessages handles complex message flows.

The function correctly:

  • Collects system messages separately
  • Tracks pending reasoning blocks to attach to assistant messages
  • Handles function calls and outputs
  • Flushes pending state at message boundaries

The logic for accumulating reasoning blocks before assistant content is particularly well-structured.


1303-1429: convertSingleCohereMessageToBifrostMessages comprehensive conversion.

The function properly:

  • Separates reasoning blocks from regular content
  • Prepends reasoning message to output
  • Handles tool calls with nil safety checks (lines 1389-1392)
  • Generates stable IDs using timestamps

505-511: Same nil-dereference pattern in tool plan ID generation.

Lines 507-511 have the same issue - dereferencing state.MessageID before checking for nil.

-				var itemID string
-				if state.MessageID == nil {
-					itemID = fmt.Sprintf("item_%d", outputIndex)
-				} else {
-					itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex)
-				}
+				var itemID string
+				if state.MessageID != nil {
+					itemID = fmt.Sprintf("msg_%s_item_%d", *state.MessageID, outputIndex)
+				} else {
+					itemID = fmt.Sprintf("item_%d", outputIndex)
+				}

Likely an incorrect or invalid review comment.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from e04023a to bb6c8dc Compare December 8, 2025 08:24
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

♻️ Duplicate comments (4)
core/providers/openai/responses.go (1)

57-59: Redundant condition check remains unfixed.

Line 59 duplicates the len(message.ResponsesReasoning.Summary) > 0 check already performed on line 57.

Apply this diff:

 		// If the message has summaries but no content blocks and the model is gpt-oss, then convert the summaries to content blocks
 		if len(message.ResponsesReasoning.Summary) > 0 &&
 			strings.Contains(bifrostReq.Model, "gpt-oss") &&
-			len(message.ResponsesReasoning.Summary) > 0 &&
 			message.Content == nil {
core/providers/anthropic/types.go (2)

299-305: Verify cache field serialization aligns with Anthropic's API expectations.

The removal of omitempty tags from CacheCreationInputTokens, CacheReadInputTokens, and the change of CacheCreation from pointer to value type means zero values will serialize as explicit JSON fields (e.g., "cache_creation_input_tokens": 0) instead of being omitted. Confirm this serialization behavior is intentional and won't break downstream clients expecting omitted fields when cache is unused.


356-356: StopSequence should use *string with omitempty for API compatibility.

Changing StopSequence from *string with omitempty to string without omitempty breaks compatibility with Anthropic's API specification. The API returns stop_sequence as either null (in initial streaming events) or a string value (the matched stop sequence). Using a non-pointer string type will serialize empty strings as "stop_sequence": "" instead of properly representing the null state.

-	StopSequence string                   `json:"stop_sequence"`
+	StopSequence *string                  `json:"stop_sequence,omitempty"`
transports/bifrost-http/integrations/anthropic.go (1)

94-105: Remove commented-out dead code.

This block appears to be dead code from a previous implementation. It should be removed to improve maintainability.

 				} else {
-					// if resp.ExtraFields.Provider == schemas.Anthropic ||
-					// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-					// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-					// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-					// 	if resp.ExtraFields.RawResponse != nil {
-					// 		var rawResponseJSON anthropic.AnthropicStreamDelta
-					// 		err := sonic.Unmarshal([]byte(resp.ExtraFields.RawResponse.(string)), &rawResponseJSON)
-					// 		if err == nil {
-					// 			return string(rawResponseJSON.Type), resp.ExtraFields.RawResponse, nil
-					// 		}
-					// 	}
-					// }
 					if len(anthropicResponse) > 1 {
🧹 Nitpick comments (6)
ui/app/workspace/logs/views/columns.tsx (1)

40-44: Transcription logs now lose any per-request prompt/context

Switching the transcription_input branch to always return "Audio file" simplifies the UI but drops any prompt or other contextual text that might have been attached to the transcription request, which can make debugging harder when scanning logs.

If that context is still useful and not too noisy, consider preserving it as a fallback (or at least in the tooltip) while keeping the short label in the cell:

} else if (log?.transcription_input) {
  return log.transcription_input.prompt?.trim() || "Audio file";
}

Or, if you want the cell body to stay generic, you could keep "Audio file" here and surface prompt only in the title attribute for this row’s message cell.

core/providers/openai/utils.go (1)

46-56: Implementation is correct; consider optional observability.

The sanitization logic correctly enforces OpenAI's 64-character limit. However, silently dropping the User field may make debugging difficult when requests unexpectedly lack user tracking.

Consider adding optional structured logging when the field is dropped, or documenting this behavior clearly for API consumers who may rely on user tracking.

core/providers/openai/responses.go (1)

42-85: Consider simplifying the nested conditionals for readability.

The reasoning block transformation logic is correct but the nested conditions make it harder to follow. The logic handles three cases:

  1. Skip reasoning-only messages without summaries/encrypted content for non-gpt-oss models
  2. Convert summaries to content blocks for gpt-oss models
  3. Pass through all other messages unchanged

Consider extracting the gpt-oss check and message transformation into a helper function:

func shouldSkipReasoningMessage(message schemas.ResponsesMessage, model string) bool {
    if message.ResponsesReasoning == nil {
        return false
    }
    return len(message.ResponsesReasoning.Summary) == 0 &&
        message.Content != nil &&
        len(message.Content.ContentBlocks) > 0 &&
        !strings.Contains(model, "gpt-oss") &&
        message.ResponsesReasoning.EncryptedContent == nil
}
transports/bifrost-http/integrations/anthropic.go (2)

7-7: Use the injected logger instead of the standard log package.

The GenericRouter has a logger schemas.Logger field, but this file uses the standard log package at line 112. This creates inconsistent logging behavior. However, since this is inside a closure that doesn't have direct access to the logger, consider passing the logger through the context or refactoring to maintain consistent logging.


118-122: Remove unreachable else clause.

The else clause at lines 120-122 is unreachable. At this point in the code, len(anthropicResponse) >= 1 is guaranteed because len == 0 already returns at lines 91-92. The conditions len > 1 and len == 1 cover all remaining cases.

 						if len(anthropicResponse) > 1 {
 							combinedContent := ""
 							for _, event := range anthropicResponse {
 								responseJSON, err := sonic.Marshal(event)
 								if err != nil {
 									// Log JSON marshaling error but continue processing (should not happen)
 									log.Printf("Failed to marshal streaming response: %v", err)
 									continue
 								}
 								combinedContent += fmt.Sprintf("event: %s\ndata: %s\n\n", event.Type, responseJSON)
 							}
 							return "", combinedContent, nil
-						} else if len(anthropicResponse) == 1 {
-							return string(anthropicResponse[0].Type), anthropicResponse[0], nil
-						} else {
-							return "", nil, nil
 						}
+						return string(anthropicResponse[0].Type), anthropicResponse[0], nil
ui/app/workspace/logs/views/logDetailsSheet.tsx (1)

187-237: Consider adding type safety for the reasoning parameter.

The as any type assertion bypasses TypeScript's type checking. If the LogEntry.params type doesn't include the reasoning field, consider extending the type definition rather than using any.

Additionally, the IIFE pattern works but could be simplified by extracting into a separate component:

// Option: Extract to a helper component
function ReasoningParametersSection({ reasoning }: { reasoning: Record<string, unknown> }) {
  if (!reasoning || typeof reasoning !== "object" || Object.keys(reasoning).length === 0) {
    return null;
  }
  return (
    <>
      <DottedSeparator />
      <div className="space-y-4">
        {/* ... rest of the rendering */}
      </div>
    </>
  );
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e04023a and bb6c8dc.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (24)
  • core/internal/testutil/account.go (1 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/bedrock/bedrock_test.go (15 hunks)
  • core/providers/bedrock/utils.go (1 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • framework/streaming/audio.go (1 hunks)
  • framework/streaming/chat.go (1 hunks)
  • framework/streaming/responses.go (4 hunks)
  • framework/streaming/transcription.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (4 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/package.json (1 hunks)
💤 Files with no reviewable changes (1)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
✅ Files skipped from review due to trivial changes (1)
  • transports/bifrost-http/handlers/middlewares.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • core/providers/bedrock/utils.go
  • ui/app/workspace/logs/views/columns.tsx
  • core/internal/testutil/account.go
  • core/providers/vertex/errors.go
  • core/providers/openai/types.go
  • core/providers/openai/responses.go
  • ui/package.json
  • core/providers/openai/utils.go
  • core/providers/anthropic/types.go
  • framework/streaming/chat.go
  • core/providers/openai/chat.go
  • framework/streaming/responses.go
  • core/schemas/bifrost.go
  • transports/bifrost-http/integrations/anthropic.go
  • transports/bifrost-http/integrations/router.go
  • framework/streaming/transcription.go
  • core/providers/openai/text.go
  • framework/streaming/audio.go
  • core/providers/utils/utils.go
  • core/providers/bedrock/bedrock_test.go
  • ui/app/workspace/logs/views/logDetailsSheet.tsx
🧬 Code graph analysis (11)
core/providers/vertex/errors.go (4)
core/providers/utils/utils.go (2)
  • CheckAndDecodeBody (467-475)
  • NewBifrostOperationError (493-504)
core/schemas/provider.go (1)
  • ErrProviderResponseDecode (29-29)
core/providers/vertex/vertex.go (1)
  • VertexError (25-31)
core/providers/vertex/types.go (1)
  • VertexValidationError (154-161)
core/providers/openai/types.go (3)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
ui/lib/types/logs.ts (1)
  • ResponsesParametersReasoning (513-520)
core/providers/openai/responses.go (3)
core/schemas/responses.go (6)
  • ResponsesMessage (313-326)
  • ResponsesReasoning (730-733)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesMessageContent (338-343)
  • ResponsesParameters (86-113)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (176-185)
  • OpenAIResponsesRequestInput (143-146)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
core/providers/openai/chat.go (2)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
framework/streaming/responses.go (2)
core/schemas/responses.go (10)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392)
  • ResponsesMessage (313-326)
  • ResponsesMessageTypeReasoning (306-306)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesReasoningContentBlockTypeSummaryText (740-740)
core/schemas/utils.go (1)
  • Ptr (16-18)
transports/bifrost-http/integrations/anthropic.go (4)
core/schemas/provider.go (1)
  • Provider (282-309)
core/schemas/bifrost.go (5)
  • Anthropic (37-37)
  • Vertex (40-40)
  • BifrostContextKeyExtraHeaders (115-115)
  • BifrostContextKeyURLPath (116-116)
  • BifrostContextKeySkipKeySelection (114-114)
core/schemas/utils.go (1)
  • IsAnthropicModel (1043-1045)
core/providers/anthropic/responses.go (1)
  • ToAnthropicResponsesStreamResponse (771-1158)
transports/bifrost-http/integrations/router.go (1)
core/schemas/bifrost.go (1)
  • BifrostContextKeyIntegrationType (120-120)
core/providers/openai/text.go (2)
core/schemas/textcompletions.go (1)
  • TextCompletionParameters (120-140)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
core/providers/utils/utils.go (2)
core/schemas/bifrost.go (2)
  • BifrostError (356-365)
  • ErrorField (374-381)
ui/lib/types/logs.ts (2)
  • BifrostError (226-232)
  • ErrorField (217-224)
core/providers/bedrock/bedrock_test.go (5)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (4)
  • ResponsesMessageTypeMessage (289-289)
  • ResponsesInputMessageRoleUser (332-332)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesInputMessageRoleSystem (333-333)
core/schemas/chatcompletions.go (1)
  • OrderedMap (268-268)
core/providers/bedrock/responses.go (2)
  • ToBedrockResponsesRequest (1387-1536)
  • ToolResult (1786-1791)
core/providers/bedrock/types.go (2)
  • BedrockMessageRoleAssistant (68-68)
  • BedrockMessageRoleUser (67-67)
ui/app/workspace/logs/views/logDetailsSheet.tsx (3)
ui/components/ui/separator.tsx (1)
  • DottedSeparator (43-43)
ui/app/workspace/logs/views/logEntryDetailsView.tsx (1)
  • LogEntryDetailsView (15-49)
ui/components/ui/badge.tsx (1)
  • Badge (37-37)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Graphite / mergeability_check
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (31)
core/providers/openai/types.go (2)

104-135: Well-structured custom unmarshalling to handle embedded struct conflict.

The implementation correctly addresses the issue where ChatParameters' custom UnmarshalJSON would otherwise hijack the entire unmarshalling process. The two-pass approach (base fields, then embedded parameters) is the right pattern.


187-225: Custom marshalling correctly handles Input and Reasoning field transformation.

The implementation properly:

  1. Preserves Input's custom marshalling via json.RawMessage
  2. Shadows Reasoning to set MaxTokens to nil for OpenAI compatibility

Note: encoding/json.RawMessage is used as a type container while sonic.Marshal performs the actual serialization—this is correct since RawMessage is just []byte.

core/providers/openai/text.go (1)

19-20: LGTM!

User field sanitization is correctly applied after copying parameters, ensuring the original bifrostReq.Params remains unchanged. This is consistent with the sanitization pattern in chat.go.

core/providers/openai/chat.go (1)

33-34: LGTM!

User field sanitization is correctly applied after copying parameters and before provider-specific filtering. The sanitization appropriately affects all provider paths that use this conversion function.

core/providers/openai/responses.go (1)

98-99: LGTM!

User field sanitization is consistent with chat.go and text.go implementations.

core/providers/utils/utils.go (1)

322-334: LGTM! Proper decoding before unmarshalling.

The addition of CheckAndDecodeBody correctly handles compressed (e.g., gzip) response bodies before unmarshalling. The error handling path returns an appropriate operation error if decoding fails.

core/providers/vertex/errors.go (1)

13-27: LGTM! Consistent decoding pattern across all error formats.

The addition of CheckAndDecodeBody properly handles compressed responses before attempting to unmarshal into any of the supported error formats (OpenAI, Vertex, VertexValidationError). All unmarshalling paths consistently use decodedBody, and the error handling is appropriate.

core/internal/testutil/account.go (1)

139-142: Bedrock Claude 4.5 Haiku mapping is correctly integrated and actively used

The new deployment entry for "claude-4.5-haiku" is used in core/providers/bedrock/bedrock_test.go (lines 1709, 1724) where it appears in test output strings as bedrock/claude-4.5-haiku. The model ID global.anthropic.claude-haiku-4-5-20251001-v1:0 follows the same naming pattern as the surrounding "claude-sonnet-4" and "claude-3.7-sonnet" mappings, maintaining consistency in the Deployments configuration.

No changes needed; this mapping is correctly configured and in active use.

framework/streaming/chat.go (1)

280-282: LGTM! Proper nil-safety guard for RawRequest access.

The triple nil-check ensures safe access to RawRequest and is consistent with the same pattern applied in audio.go and transcription.go.

ui/package.json (1)

46-46: Verify caret range behavior for 0.x versions.

For packages with major version 0, the caret (^0.542.0) only allows patch-level updates (0.542.x), not minor updates. This is typically safe, but be aware that some 0.x packages don't strictly follow semver for patches. Consider pinning exact versions for production stability if you encounter breaking changes.

framework/streaming/audio.go (1)

171-173: LGTM! Consistent nil-safety guard across streaming paths.

The triple nil-check mirrors the pattern in chat.go and transcription.go, ensuring safe access to RawRequest in the audio streaming path.

framework/streaming/transcription.go (1)

185-187: LGTM! Completes nil-safety pattern across all streaming paths.

This nil-check is consistent with chat.go and audio.go, ensuring safe access to RawRequest in the transcription streaming path.

core/schemas/bifrost.go (1)

120-121: LGTM! New context key follows established conventions.

The new BifrostContextKeyIntegrationType constant follows the existing naming pattern and enables integration type propagation through the context.

transports/bifrost-http/integrations/router.go (3)

312-314: LGTM! Integration type propagation via context.

This correctly stores the integration type in the bifrost context using the new BifrostContextKeyIntegrationType constant, enabling downstream handlers to access the route configuration type.


709-712: LGTM! Correct DONE marker suppression for Anthropic and responses API.

The logic correctly disables the [DONE] marker for:

  • Anthropic integration (uses event-based termination like message_stop)
  • OpenAI responses API (terminates by closing the stream)

This aligns with each provider's streaming specification.


883-885: SSE string prefix detection handles both custom and standard formats.

The condition correctly identifies pre-formatted SSE strings (starting with "data: " or "event: ") and wraps other strings with the standard "data: " prefix. This supports providers like Anthropic that return custom SSE event formats.
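The detection described above can be sketched in a few lines (`toSSEChunk` is a hypothetical helper name, not the router's actual function):

```go
package main

import (
	"fmt"
	"strings"
)

// toSSEChunk passes pre-formatted SSE frames through untouched and
// wraps anything else as a standard data event.
func toSSEChunk(s string) string {
	if strings.HasPrefix(s, "data: ") || strings.HasPrefix(s, "event: ") {
		return s // already SSE-formatted (e.g. Anthropic event frames)
	}
	return "data: " + s + "\n\n"
}

func main() {
	fmt.Print(toSSEChunk(`{"delta":"hi"}`))                 // gets wrapped
	fmt.Print(toSSEChunk("event: message_stop\ndata: {}\n\n")) // passes through
}
```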

core/providers/bedrock/bedrock_test.go (6)

364-364: Test fixtures updated with explicit Type field.

The addition of Type: schemas.Ptr(schemas.ResponsesMessageTypeMessage) across test cases aligns with schema changes and makes the message type explicit.


700-700: Status changed from "in_progress" to "completed".

This reflects the correct status for function call messages that have been processed. The semantic change from "in_progress" to "completed" aligns with the actual state of completed tool calls.


742-743: Simplified tool output representation.

Using ResponsesToolCallOutputStr directly instead of content blocks simplifies the test fixture and reflects the expected output structure.


665-668: Type corrected to OrderedMap.

The change from map[string]interface{} to schemas.OrderedMap aligns with the actual type definition, ensuring test fixtures match production types.


1375-1390: Improved assertion strategy for generated IDs.

Structure-based comparison instead of exact equality is the correct approach when IDs or timestamps are generated at runtime, and it makes the tests less brittle.


1646-1897: Valuable regression test for interleaved tool calls bug.

TestInterleavedToolCallsWithAssistantMessage is a comprehensive integration test that:

  1. Reproduces the exact scenario from the bug report
  2. Validates that tool_result counts don't exceed tool_use counts
  3. Tests complex message interleaving with assistant text between tool batches

This is excellent defensive testing that will prevent regression of the "toolResult blocks exceeds toolUse blocks" error.

core/providers/bedrock/utils.go (1)

15-18: No concerns with the Params optional behavior change.

The function now returns nil when bifrostReq.Params is nil, making parameters optional. This is safe: the single call site at core/providers/bedrock/chat.go:38 properly handles the error return value, and returning nil (indicating no error) is backward compatible. The change makes the API more lenient without breaking existing callers.

core/providers/anthropic/types.go (2)

143-144: LGTM!

The new AnthropicContentBlockTypeRedactedThinking constant follows the existing pattern and aligns with Anthropic's API for handling redacted thinking content blocks.


153-153: LGTM!

The new Data field appropriately supports encrypted data for redacted thinking content blocks and follows the existing optional field pattern with pointer type and omitempty.

transports/bifrost-http/integrations/anthropic.go (2)

74-81: LGTM!

The broadened provider check correctly handles Anthropic models on Vertex by using schemas.IsAnthropicModel to check both ModelRequested and ModelDeployment. This ensures raw responses are passed through for Anthropic-compatible models regardless of the underlying provider.


193-206: LGTM!

The updated passthrough logic correctly handles cases where the provider is explicitly Anthropic or unspecified (empty string). This allows proper header forwarding and OAuth passthrough for Anthropic-compatible requests.

framework/streaming/responses.go (4)

498-534: LGTM!

The new ReasoningSummaryTextDelta case follows the established pattern for delta handling. The backward search for matching ItemID is efficient, and the fallback to create a new reasoning message with proper type and role is correct. Handling both Delta and Signature independently allows for flexible streaming scenarios.


626-679: LGTM!

The helper correctly handles two accumulation paths: content blocks (when contentIndex is provided) and summary-level accumulation. The pattern mirrors existing delta helpers like appendTextDeltaToResponsesMessage. The comment at lines 667-668 acknowledges the single-summary-entry limitation for future enhancement.


681-727: LGTM!

The signature helper correctly mirrors the delta helper structure. It properly stores signatures either in content block Signature fields (when contentIndex is provided) or in ResponsesReasoning.EncryptedContent for summary-level storage, aligning with the schema definitions.


865-866: LGTM!

The added nil checks prevent potential nil pointer dereferences when accessing RawRequest. The same defensive pattern at line 1008 ensures consistency across both code paths.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from bb6c8dc to 68bd009 Compare December 8, 2025 11:54
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from d6466cb to 1cb0123 Compare December 8, 2025 11:54

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/providers/openai/responses.go (1)

107-134: Remove unsupported tool types from the hardcoded list.

The code includes LocalShell, Custom, and WebSearchPreview, which are not part of OpenAI's official Responses API tool support as of December 2025. OpenAI's documented tools are: Web search, File search, Computer use, Code Interpreter, Image generation, and Remote MCP servers. Either remove these unsupported types or clarify whether they represent codebase-specific abstractions that require separate validation logic.

♻️ Duplicate comments (3)
core/providers/utils/utils.go (1)

267-271: Prefer compact JSON over MarshalIndent for request bodies

Using sonic.MarshalIndent here increases payload size for every provider request without functional benefit; pretty-printing is usually only needed in logs or tests. Unless an upstream API strictly requires indented JSON, consider reverting to sonic.Marshal(convertedBody) and handling any readability needs separately.

-		jsonBody, err := sonic.MarshalIndent(convertedBody, "", "  ")
+		jsonBody, err := sonic.Marshal(convertedBody)

If indented JSON is actually required by a specific provider, it’d be worth documenting that constraint next to this call.

core/providers/anthropic/types.go (1)

356-356: StopSequence type change was discussed in prior review.

The previous review recommended using *string without omitempty to properly preserve null values from Anthropic's API. The current implementation uses string which converts null to empty string. Since this was extensively discussed and you confirmed the API behavior, I'll defer to your judgment, but note that *string without omitempty would more faithfully represent the distinction between null and empty string if that matters for downstream consumers.

transports/bifrost-http/integrations/anthropic.go (1)

94-105: Remove commented-out dead code.

This commented block was flagged in a previous review and marked as addressed, but it's still present. Remove it to improve maintainability.

 					} else {
-						// if resp.ExtraFields.Provider == schemas.Anthropic ||
-						// 	(resp.ExtraFields.Provider == schemas.Vertex &&
-						// 		(schemas.IsAnthropicModel(resp.ExtraFields.ModelRequested) ||
-						// 			schemas.IsAnthropicModel(resp.ExtraFields.ModelDeployment))) {
-						// 	if resp.ExtraFields.RawResponse != nil {
-						// 		var rawResponseJSON anthropic.AnthropicStreamDelta
-						// 		err := sonic.Unmarshal([]byte(resp.ExtraFields.RawResponse.(string)), &rawResponseJSON)
-						// 		if err == nil {
-						// 			return string(rawResponseJSON.Type), resp.ExtraFields.RawResponse, nil
-						// 		}
-						// 	}
-						// }
 						if len(anthropicResponse) > 1 {
🧹 Nitpick comments (4)
ui/app/workspace/logs/views/logResponsesMessageView.tsx (1)

202-204: Skip-empty reasoning guard looks good; align equality style

The early return cleanly avoids rendering empty reasoning-only cards. Minor nit: use strict equality (===) in the message.type check to match the rest of the file and typical TS linting.

core/providers/cohere/responses.go (2)

1242-1244: Nil-safe access to CallID via explicit check on ResponsesToolMessage.

The previous review flagged potential nil-pointer dereference when accessing msg.CallID through the embedded *ResponsesToolMessage. The code now has:

if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
    toolCall.ID = msg.CallID
}

This is safer, though note that line 1243 still accesses msg.CallID (the promoted field) rather than msg.ResponsesToolMessage.CallID. This works because the outer check ensures ResponsesToolMessage is not nil, but for clarity and consistency, consider using the explicit path.

 if msg.ResponsesToolMessage != nil && msg.ResponsesToolMessage.CallID != nil {
-    toolCall.ID = msg.CallID
+    toolCall.ID = msg.ResponsesToolMessage.CallID
 }

500-501: Duplicate comment in tool plan delta handling.

Lines 500-501 contain a duplicated comment:

// Generate stable ID for text item
// Generate stable ID for text item
-			// Generate stable ID for text item
-			// Generate stable ID for text item
+			// Generate stable ID for text item
 			var itemID string
transports/bifrost-http/integrations/anthropic.go (1)

110-113: Consider using the structured logger instead of log.Printf.

The GenericRouter receives a schemas.Logger that could provide structured logging with context. Using log.Printf here bypasses any configured logging infrastructure.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bb6c8dc and 68bd009.

⛔ Files ignored due to path filters (1)
  • ui/package-lock.json is excluded by !**/package-lock.json
📒 Files selected for processing (28)
  • core/internal/testutil/account.go (1 hunks)
  • core/providers/anthropic/chat.go (1 hunks)
  • core/providers/anthropic/errors.go (2 hunks)
  • core/providers/anthropic/types.go (4 hunks)
  • core/providers/bedrock/bedrock_test.go (15 hunks)
  • core/providers/bedrock/utils.go (1 hunks)
  • core/providers/cohere/responses.go (9 hunks)
  • core/providers/gemini/responses.go (2 hunks)
  • core/providers/openai/chat.go (1 hunks)
  • core/providers/openai/responses.go (2 hunks)
  • core/providers/openai/text.go (1 hunks)
  • core/providers/openai/types.go (3 hunks)
  • core/providers/openai/utils.go (1 hunks)
  • core/providers/utils/utils.go (4 hunks)
  • core/providers/vertex/errors.go (1 hunks)
  • core/schemas/bifrost.go (1 hunks)
  • core/schemas/responses.go (5 hunks)
  • framework/streaming/responses.go (3 hunks)
  • transports/bifrost-http/handlers/inference.go (1 hunks)
  • transports/bifrost-http/handlers/middlewares.go (1 hunks)
  • transports/bifrost-http/integrations/anthropic.go (4 hunks)
  • transports/bifrost-http/integrations/router.go (3 hunks)
  • ui/app/workspace/logs/views/columns.tsx (1 hunks)
  • ui/app/workspace/logs/views/logDetailsSheet.tsx (1 hunks)
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx (2 hunks)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx (0 hunks)
  • ui/lib/types/logs.ts (2 hunks)
  • ui/package.json (1 hunks)
💤 Files with no reviewable changes (1)
  • ui/app/workspace/logs/views/logResponsesOutputView.tsx
🚧 Files skipped from review as they are similar to previous changes (7)
  • transports/bifrost-http/handlers/inference.go
  • transports/bifrost-http/integrations/router.go
  • ui/app/workspace/logs/views/logDetailsSheet.tsx
  • core/schemas/bifrost.go
  • transports/bifrost-http/handlers/middlewares.go
  • ui/lib/types/logs.ts
  • core/internal/testutil/account.go
🧰 Additional context used
📓 Path-based instructions (1)
**

⚙️ CodeRabbit configuration file

always check the stack if there is one for the current PR. do not give localized reviews for the PR, always see all changes in the light of the whole stack of PRs (if there is a stack, if there is no stack you can continue to make localized suggestions/reviews)

Files:

  • ui/package.json
  • core/providers/vertex/errors.go
  • core/providers/openai/chat.go
  • ui/app/workspace/logs/views/logResponsesMessageView.tsx
  • core/providers/anthropic/errors.go
  • core/providers/bedrock/bedrock_test.go
  • core/providers/openai/responses.go
  • core/providers/gemini/responses.go
  • transports/bifrost-http/integrations/anthropic.go
  • core/providers/anthropic/chat.go
  • core/providers/openai/text.go
  • core/providers/anthropic/types.go
  • core/schemas/responses.go
  • framework/streaming/responses.go
  • core/providers/openai/utils.go
  • core/providers/bedrock/utils.go
  • core/providers/utils/utils.go
  • core/providers/openai/types.go
  • ui/app/workspace/logs/views/columns.tsx
  • core/providers/cohere/responses.go
🧬 Code graph analysis (9)
core/providers/openai/chat.go (2)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
core/providers/anthropic/errors.go (1)
ui/lib/types/logs.ts (1)
  • BifrostError (226-232)
core/providers/bedrock/bedrock_test.go (5)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/schemas/responses.go (6)
  • ResponsesMessageTypeMessage (289-289)
  • ResponsesInputMessageRoleUser (332-332)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesInputMessageRoleSystem (333-333)
  • ResponsesMessageTypeFunctionCall (294-294)
  • BifrostResponsesRequest (32-39)
core/schemas/chatcompletions.go (1)
  • OrderedMap (268-268)
core/providers/bedrock/responses.go (2)
  • ToBedrockResponsesRequest (1387-1536)
  • ToolResult (1786-1791)
core/providers/bedrock/types.go (2)
  • BedrockMessageRoleAssistant (68-68)
  • BedrockMessageRoleUser (67-67)
core/providers/openai/responses.go (2)
core/providers/openai/types.go (2)
  • OpenAIResponsesRequest (176-185)
  • OpenAIResponsesRequestInput (143-146)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
core/providers/openai/text.go (2)
core/schemas/textcompletions.go (1)
  • TextCompletionParameters (120-140)
core/providers/openai/utils.go (1)
  • SanitizeUserField (51-56)
framework/streaming/responses.go (3)
core/schemas/responses.go (10)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392)
  • ResponsesMessage (313-326)
  • ResponsesMessageTypeReasoning (306-306)
  • ResponsesInputMessageRoleAssistant (331-331)
  • ResponsesReasoning (730-733)
  • ResponsesReasoningSummary (744-747)
  • ResponsesMessageContent (338-343)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesReasoningContentBlockTypeSummaryText (740-740)
core/schemas/utils.go (1)
  • Ptr (16-18)
core/utils.go (1)
  • Ptr (51-53)
core/providers/utils/utils.go (2)
core/schemas/bifrost.go (2)
  • BifrostError (356-365)
  • ErrorField (374-381)
ui/lib/types/logs.ts (2)
  • BifrostError (226-232)
  • ErrorField (217-224)
core/providers/openai/types.go (2)
core/schemas/chatcompletions.go (1)
  • ChatParameters (155-184)
core/schemas/responses.go (1)
  • ResponsesParametersReasoning (233-238)
core/providers/cohere/responses.go (3)
core/providers/cohere/types.go (5)
  • CohereContentBlock (142-156)
  • CohereContentBlockTypeText (134-134)
  • CohereContentBlockTypeImage (135-135)
  • CohereContentBlockTypeThinking (136-136)
  • CohereImageURL (159-161)
core/schemas/responses.go (22)
  • ResponsesMessageContentBlock (398-410)
  • ResponsesOutputMessageContentTypeText (391-391)
  • ResponsesInputMessageContentBlockTypeImage (388-388)
  • ResponsesInputMessageContentBlockImage (412-415)
  • ResponsesOutputMessageContentTypeReasoning (393-393)
  • ResponsesInputMessageContentBlockTypeText (387-387)
  • BifrostResponsesStreamResponse (1426-1465)
  • BifrostResponsesResponse (45-84)
  • ResponsesStreamResponseTypeCreated (1362-1362)
  • ResponsesStreamResponseTypeInProgress (1363-1363)
  • ResponsesStreamResponseTypeOutputTextDone (1375-1375)
  • ResponsesStreamResponseTypeContentPartDone (1372-1372)
  • ResponsesMessage (313-326)
  • ResponsesMessageContent (338-343)
  • ResponsesStreamResponseTypeOutputItemAdded (1368-1368)
  • ResponsesReasoning (730-733)
  • ResponsesStreamResponseTypeOutputTextDelta (1374-1374)
  • ResponsesStreamResponseTypeReasoningSummaryTextDelta (1392-1392)
  • ResponsesStreamResponseTypeReasoningSummaryTextDone (1393-1393)
  • ResponsesToolMessage (461-481)
  • ResponsesToolFunction (1047-1050)
  • ResponsesMessageRoleType (328-328)
core/providers/utils/utils.go (1)
  • GetRandomString (1001-1012)
⏰ Context from checks skipped due to timeout of 900000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Graphite / mergeability_check
🔇 Additional comments (33)
ui/package.json (1)

46-46: Verify the intentionality of the semantic versioning change for lucide-react.

The dependency has been changed from an exact version pin ("0.542.0") to a caret range ("^0.542.0"), which deviates from the project's apparent versioning strategy. Throughout the rest of package.json, all dependencies are pinned to exact versions (e.g., "3.2.12", "5.2.1", "19.2.1").

This makes lucide-react an exception. While caret ranges allow for minor/patch updates (which are typically safe for a UI icon library), the inconsistency could introduce subtle version-related issues if new minor versions of lucide-react are released with behavioral changes. Given that this is the only change in this file and the PR's focus is on reasoning content blocks, clarify whether this change is intentional or if all dependencies should remain pinned to exact versions for reproducibility.

Additionally, verify that the reasoning/thinking content block changes elsewhere in the PR do not depend on or require specific new icons from lucide-react that would justify a version bump.

ui/app/workspace/logs/views/columns.tsx (1)

42-43: The review comment is based on incorrect information about the code. The TranscriptionInput interface in the schema only contains a file field and never had a prompt field. A search of the entire codebase confirms no references to transcription_input.prompt exist. The current implementation (return "Audio file";) is correct and appropriate for displaying transcription entries in the log table. The detailed transcription data is available in dedicated detail views.

Likely an incorrect or invalid review comment.

core/providers/openai/types.go (3)

4-4: LGTM!

Import is necessary for json.RawMessage used in the new MarshalJSON method. Using json.RawMessage (a []byte alias) alongside sonic for marshal operations is fine.


104-135: LGTM!

Correct pattern for handling embedded structs with custom unmarshallers. The two-phase unmarshal approach properly preserves both the request-specific fields and the ChatParameters with its custom logic.


187-225: Implementation is correct; minor documentation nit.

The shadowing approach correctly ensures reasoning.max_tokens is excluded from OpenAI requests while preserving other reasoning fields. The json.RawMessage technique for Input properly delegates to its custom marshaller.

Minor nit: The comment on line 188 says "parameters.reasoning.max_tokens" but since ResponsesParameters is embedded, the actual JSON path is just reasoning.max_tokens.

The MaxTokens exclusion here is intentional and correct—Anthropic and Bedrock populate MaxTokens from their budget tokens (as the schema comment indicates it's "required for anthropic"), but OpenAI's API does not support the reasoning.max_tokens parameter, so nullifying it is the right approach.

ui/app/workspace/logs/views/logResponsesMessageView.tsx (1)

273-274: Nice addition for long content wrapping

Adding break-words on the string content branch should prevent horizontal overflow for long tokens without affecting existing formatting.

core/providers/bedrock/utils.go (1)

15-18: Graceful handling of nil ChatParameters

Treating bifrostReq.Params == nil as a no-op and returning nil here matches the “params optional” semantics and keeps callers simple.

core/providers/utils/utils.go (2)

322-335: Centralized body decoding in error handler looks solid

Routing provider error bodies through CheckAndDecodeBody (with gzip support) before unmarshalling standardizes behavior across providers and ensures structured errors even when responses are compressed.


1000-1012: Random string helper is fine for non‑security identifiers

GetRandomString now validates length <= 0 and returns "" in that case, avoiding panics. Using math/rand and a simple a‑z0‑9 alphabet is appropriate for cosmetic IDs; just avoid reusing this for anything security‑sensitive.

core/providers/vertex/errors.go (1)

14-41: Decoded‑body error handling is consistent and robust

Using CheckAndDecodeBody up front and then unmarshalling from decodedBody preserves the existing fallback chain while handling gzip (and future encodings) correctly. Returning an ErrProviderResponseDecode Bifrost error on decode failure is also a reasonable, explicit failure mode.

core/providers/bedrock/bedrock_test.go (2)

360-377: Stronger Bedrock⇔Responses conversion contracts in tests

The added Type/Role fields, explicit Status: Ptr("completed") on function calls, OrderedMap expectations for additional model fields, and the structure-based assertions for responses all strengthen the invariants around Bedrock–Responses conversions while avoiding brittleness from runtime IDs and timestamps. This is a solid tightening of test coverage without over-specifying internals.

Also applies to: 404-429, 452-473, 496-521, 560-593, 630-671, 695-711, 733-749, 783-811, 920-975, 989-1010, 1050-1086, 1110-1125, 1127-1255, 1283-1316, 1414-1486, 1507-1519


1653-1897: Interleaved tool_use/tool_result regression test is well scoped

TestInterleavedToolCallsWithAssistantMessage faithfully encodes the reported message sequence and asserts that the converted Bedrock request preserves assistant messages and never produces more tool_result blocks than preceding tool_use blocks per assistant/user pair. This is an appropriate, targeted guard against the original validation error.

core/schemas/responses.go (1)

68-84: Schema extensions for stop reasons, signatures, and reasoning summaries look consistent

Adding StopReason on BifrostResponsesResponse, per-part Signature on ResponsesMessageContentBlock, the dedicated ResponsesReasoningSummary type (and Summary []ResponsesReasoningSummary on ResponsesReasoning), plus Signature on BifrostResponsesStreamResponse are all backwards-compatible optional fields that align with the new reasoning/streaming needs and match how the UI and provider adapters are consuming these structures.

Also applies to: 399-410, 729-747, 1426-1465

core/providers/gemini/responses.go (2)

167-179: ThoughtSignature and reasoning summary wiring is coherent

Emitting a separate ResponsesMessageTypeReasoning message with ResponsesReasoning{Summary: []ResponsesReasoningSummary{}, EncryptedContent: thoughtSig} when part.ThoughtSignature is present, and then re-attaching that encrypted content back onto the Gemini Part.ThoughtSignature in convertResponsesMessagesToGeminiContents, gives you a clean, provider-agnostic representation in the middle while still satisfying Gemini 3 Pro’s thought-signature requirements.

Also applies to: 531-535


138-179: The review's justification code does not exist in the codebase

The code at lines 138–179 does show argumentsStr being initialized to "" and Arguments set to &argumentsStr even when empty, but the claimed problem location and error path cannot be verified. The specific sonic.Unmarshal code cited in convertResponsesMessagesToGeminiContents does not exist, the error message is not present in the codebase, and lines 596–618 do not contain similar FunctionCall handling code. Without a verifiable round-trip code path or test demonstrating the failure scenario, the proposed fix cannot be properly justified or validated.

Likely an incorrect or invalid review comment.

framework/streaming/responses.go (4)

497-534: Well-structured reasoning delta handling with proper ID-based message lookup.

The implementation correctly:

  • Searches for existing reasoning messages by ItemID before creating new ones
  • Uses proper type initialization with ResponsesMessageTypeReasoning and ResponsesInputMessageRoleAssistant
  • Delegates to focused helper methods for delta and signature appending

626-679: Reasoning delta accumulation handles both content-indexed and summary paths correctly.

The dual-path logic is sound:

  • With contentIndex: appends to Content.ContentBlocks[*contentIndex].Text
  • Without contentIndex: accumulates into ResponsesReasoning.Summary[0].Text

One minor observation: when accumulating without content index, the code always appends to Summary[0], which is documented as intentional ("for now, accumulate into a single summary entry"). This is reasonable for current use cases.


681-727: Signature handling mirrors delta logic appropriately.

The implementation correctly:

  • With contentIndex: sets/appends to Content.ContentBlocks[*contentIndex].Signature
  • Without contentIndex: sets/appends to ResponsesReasoning.EncryptedContent

This aligns with the schema definitions where Signature is a content block field and EncryptedContent is a reasoning-level field.


865-867: Improved nil safety when accessing RawRequest.

The added nil checks prevent potential panics when result or result.ResponsesStreamResponse is nil before accessing ExtraFields.RawRequest.

core/providers/cohere/responses.go (4)

316-364: Reasoning block streaming implemented correctly with proper state tracking.

The implementation:

  • Generates stable IDs using providerUtils.GetRandomString(50)
  • Tracks reasoning content indices in state.ReasoningContentIndices
  • Emits both OutputItemAdded and ContentPartAdded events

This aligns with the OpenAI-style streaming lifecycle.


408-482: Content end handling properly distinguishes reasoning vs text blocks.

The code:

  • Checks state.ReasoningContentIndices[*chunk.Index] to determine block type
  • Emits ReasoningSummaryTextDone for reasoning blocks, OutputTextDone for text
  • Cleans up tracking with delete(state.ReasoningContentIndices, *chunk.Index)
  • Always emits OutputItemDone for all block types

1122-1144: Tool choice mapping now correctly handles "auto".

The previous review flagged that "auto" was incorrectly mapped to ToolChoiceRequired. This is now fixed:

  • "auto"ToolChoiceAuto
  • "required", "function"ToolChoiceRequired
  • "none"ToolChoiceNone

1214-1223: Encrypted content still exposed in plain text marker.

The past review flagged security concerns about wrapping encrypted content in [ENCRYPTED_REASONING: ...]. This exposes the encrypted content in an unprotected format.

The suggestion was to skip encrypted content entirely since Cohere doesn't support it. Consider whether this marker format is intentional for debugging or if it should be removed for security.

Is exposing encrypted reasoning content with a plain text marker the intended behavior, or should this be skipped/handled differently for security?

core/providers/openai/text.go (1)

19-20: User field sanitization aligns with chat request handling.

The sanitization ensures the User field doesn't exceed OpenAI's 64-character limit. This mirrors the same logic applied in ToOpenAIChatRequest.

Note that SanitizeUserField silently drops (returns nil) fields exceeding the limit rather than truncating. This is a design choice - verify this silent dropping is acceptable, as it could cause unexpected behavior if callers rely on the User field being set.

Is silently nullifying the User field (vs. truncating or returning an error) the intended behavior across all request types?

core/providers/openai/chat.go (1)

33-34: Consistent user field sanitization with text completion requests.

The sanitization logic is applied consistently across both chat and text completion request conversions, using the shared SanitizeUserField helper from utils.go.

core/providers/anthropic/chat.go (1)

600-624: Remove this concern—the change is correct and intentional.

Empty PartialJSON chunks are intentionally created upstream (see responses.go:3565) as part of the streaming protocol for chunking JSON arguments. The original condition that filtered out empty strings (chunk.Delta.PartialJSON != nil && *chunk.Delta.PartialJSON != "") was overly restrictive and prevented legitimate empty deltas from being emitted. The relaxed condition correctly allows all non-nil PartialJSON values, including empty strings, to flow through the streaming response.

Likely an incorrect or invalid review comment.

core/providers/anthropic/errors.go (1)

42-58: LGTM - Clean implementation leveraging existing conversion logic.

The function correctly reuses ToAnthropicChatCompletionError for the error structure and formats it as an SSE event. The SSE format matches Anthropic's streaming protocol.

transports/bifrost-http/integrations/anthropic.go (1)

193-206: LGTM - OAuth passthrough logic correctly gated.

The refactored logic appropriately handles the OAuth flow by preserving headers and URL path only when API key authentication is not present, allowing proper passthrough to Anthropic's API.

core/providers/openai/responses.go (2)

42-84: LGTM - Reasoning content transformation handles provider-specific requirements.

The logic correctly:

  1. Filters out reasoning content blocks for non-gpt-oss OpenAI models (which don't support them)
  2. Converts summaries to reasoning content blocks for gpt-oss models
  3. Preserves encrypted content when present

This enables reasoning interoperability across providers with different capabilities.


97-98: User field silently dropped when exceeding 64 characters.

The sanitization correctly enforces OpenAI's limit, but users won't receive feedback when their user identifier is discarded. Consider whether logging this condition would help debugging.

core/providers/anthropic/types.go (3)

135-145: LGTM - New redacted thinking content block type.

The AnthropicContentBlockTypeRedactedThinking constant enables proper handling of Anthropic's extended thinking feature where some reasoning content may be encrypted or redacted for privacy/security reasons.


153-153: LGTM - Data field for encrypted content.

The Data field appropriately supports encrypted content in redacted thinking blocks, complementing the new redacted_thinking content block type.


299-305: Cache fields correctly mirror Anthropic's API behavior.

As confirmed in prior review discussion, Anthropic's API always includes these fields even when values are zero. Removing omitempty ensures faithful round-trip serialization.

@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from 1cb0123 to f40e99c Compare December 8, 2025 13:33
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from 68bd009 to e17bfb2 Compare December 8, 2025 13:33
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-05-feat_send_back_raw_request_support branch from f40e99c to 927142b Compare December 8, 2025 13:45
@Pratham-Mishra04 Pratham-Mishra04 force-pushed the 12-04-feat_responses_reasoning_fixes branch from e17bfb2 to 4276354 Compare December 8, 2025 13:45
Development

Successfully merging this pull request may close these issues.

add support for anthropic passthrough (claude code support) for azure and bedrock
