Skip to content

Conversation

@KUNJ1311
Copy link

Summary

Implements model filtering for the /v1/models endpoint based on virtual key's provider_configs. This allows controlling which models are visible to specific virtual keys by specifying allowed models, providing fine-grained access control at the API level.

Changes

  • Added filterModelsByVirtualKey() function in transports/bifrost-http/handlers/inference.go to filter models based on virtual key's provider_configs
  • Integrated filtering in the listModels handler for /v1/models endpoint (applied only when governance is enabled)
  • Modified NewInferenceHandler() to accept governance store parameter and added governanceStore field to CompletionHandler struct
  • Updated server initialization (server/server.go) to inject governance store into inference handler
  • Added comprehensive test suite (test_virtual_keys_access.py) with 9 test cases covering all filtering scenarios
  • Updated documentation in tests/governance/README.md and docs/features/governance/virtual-keys.mdx with detailed /v1/models filtering behavior

Design decisions:

  • Filtering happens at the HTTP handler level after models are fetched from bifrost core
  • Only applies when EnableGovernance config is true
  • Returns all models when no VK header is provided (backward compatible)
  • Empty provider_configs = unrestricted access, empty allowed_models = all models from provider
  • Fail-safe: returns unfiltered models if governance store is nil or VK not found

Type of change

  • Bug fix
  • Feature
  • Refactor
  • Documentation
  • Chore/CI

Affected areas

  • Core (Go)
  • Transports (HTTP) - Inference handler, Server initialization
  • Providers/Integrations
  • Plugins
  • UI (Next.js)
  • Docs
  • Tests (Governance)

How to test

1. Start Bifrost server

make dev

2. Run the test suite

cd tests/governance
pytest test_virtual_keys_access.py -v

Expected outcome: All 9 tests pass

  • test_list_models_with_vk_filters_by_provider_config - Filters models by VK with specific allowed_models
  • test_list_models_with_empty_allowed_models_returns_all_provider_models - Empty allowed_models returns all models from that provider
  • test_list_models_without_provider_configs_returns_all - Empty provider_configs returns all models (unrestricted)
  • test_list_models_with_nonexistent_provider_returns_empty - Non-existent providers return empty list
  • test_list_models_with_multiple_providers - Filters models from multiple providers in provider_configs
  • test_list_models_without_vk_header_returns_all - Without VK header returns all models (backward compatible)
  • test_list_models_with_invalid_vk_header - Invalid VK header returns all models without filtering
  • test_list_models_with_inactive_vk - Inactive VK returns all models without filtering
  • test_list_models_with_mixed_provider_configs - One provider with specific models, another with all models

4. Manual testing

Create a virtual key with allowed models:

List models:

# List models without VK (returns all models)
curl http://localhost:8080/v1/models

# List models with VK (returns only the allowed models based on provider_configs)
curl http://localhost:8080/v1/models -H "x-bf-vk: <vk-value>"

Expected behavior:

  • Without VK header: All models from all configured providers
  • With VK header: Only models matching the virtual key's provider_configs
    • If provider_configs is empty → all models
    • If allowed_models is empty for a provider → all models from that provider
    • If allowed_models has specific models → only those models

Screenshots/Recordings

N/A - Backend API feature

Breaking changes

  • Yes
  • No

This is a backward-compatible addition. Existing behavior (returning all models without VK header) is preserved.

Related issues

[Feature]: filter models in /v1/models based on virtual key #808

Security considerations

  • Model filtering enforces access control at the API level
  • Virtual key validation already handled by existing governance middleware
  • No new security vulnerabilities introduced
  • Filter logic is fail-safe: returns all models if governance is disabled, governance store is nil, or VK is not found (preserves existing behavior)
  • Filtering only applies when EnableGovernance is true in config

Checklist

  • I read docs/contributing/README.md and followed the guidelines
  • I added/updated tests where appropriate (9 comprehensive test cases)
  • I updated documentation where needed (README.md and virtual-keys.mdx)
  • I verified builds succeed (Go and UI)
  • I verified the CI pipeline passes locally if applicable (all tests passing)"

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 19, 2025

📝 Walkthrough

Summary by CodeRabbit

Release Notes

  • New Features

    • Virtual keys can now restrict which models are visible when using the model listing endpoint based on provider and model allowlists
    • Access control policies automatically apply to both model discovery and inference requests
  • Documentation

    • Added comprehensive guides on virtual key model filtering behavior, including configuration examples and edge cases

✏️ Tip: You can customize this high-level summary in your review settings.

Walkthrough

Inference model-listing now consults an optional GovernanceStore and applies virtual-key (x-bf-vk) provider_configs to filter models returned by GET /v1/models; documentation and tests were added and server wiring updated to pass the GovernanceStore into the inference handler.

Changes

Cohort / File(s) Summary
Documentation
docs/features/governance/virtual-keys.mdx
Adds explanation and examples for how the x-bf-vk header filters models returned by GET /v1/models, covering header present/absent, empty provider_configs, and empty allowed_models.
Test docs
tests/governance/README.md
Documents the new virtual-key model filtering tests, the @pytest.mark.access_control marker, and the GET /v1/models behavior under integration endpoints.
Tests
tests/governance/test_virtual_keys_access.py
New TestModelsFiltering test suite covering provider_configs / allowed_models filtering, empty configs, nonexistent/multiple providers, missing/invalid/inactive VK handling, mixed provider configs, and response shape assertions.
Inference handler
transports/bifrost-http/handlers/inference.go
Adds governanceStore *governance.GovernanceStore to CompletionHandler, updates NewInferenceHandler signature, adds getVirtualKeyProviders(...) and filterModelsByAllowedModels(...), and applies VK-based provider/model filtering in listModels.
Server wiring
transports/bifrost-http/server/server.go
Discovers governance plugin, obtains GovernanceStore (if present) and passes it into NewInferenceHandler(...); minor formatting and explicit return added.
Bifrost core
core/bifrost.go
Adds ListModelsFromProviders(ctx, providers, request) to fetch models concurrently per-provider and refactors ListAllModels to use the new helper.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Server as HTTP Server
    participant Handler as InferenceHandler
    participant GovStore as GovernanceStore
    participant Bifrost as BifrostClient

    Client->>Server: GET /v1/models (may include x-bf-vk)
    Server->>Handler: listModels(request)
    Handler->>Bifrost: ListModelsFromProviders / ListAllModels
    Bifrost-->>Handler: models[]

    alt GovernanceStore present AND x-bf-vk header present
        Handler->>GovStore: Resolve VK -> provider_configs
        GovStore-->>Handler: provider_configs
        Handler->>Handler: getVirtualKeyProviders(provider_configs)
        Handler->>Handler: filterModelsByAllowedModels(allowedProviders, models)
        Handler-->>Server: filtered models
    else No governance or no header
        Handler-->>Server: models (unfiltered)
    end

    Server-->>Client: JSON models list
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Inspect getVirtualKeyProviders / filterModelsByAllowedModels for correct provider → allowed_models mapping and empty-list semantics.
  • Verify NewInferenceHandler signature updates and all call sites (server wiring).
  • Review ListModelsFromProviders concurrency, pagination, aggregation, and its integration into ListAllModels.
  • Validate new tests for environment assumptions, cleanup, and skip conditions.

Poem

🐰 I nibble docs and tests with cheer,

x-bf-vk shows models near,
Hops through providers, counts each name,
Filters tidy, none the same,
A merry hop — access clear ✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: adding model filtering for the /v1/models endpoint based on virtual key provider_configs.
Description check ✅ Passed The description includes all required sections: Summary, Changes, Type of change, Affected areas, How to test, Breaking changes, Related issues, Security considerations, and Checklist. All major aspects are well-documented.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (4)
tests/governance/README.md (1)

51-58: Docs align with behavior; consider wiring in the new marker

The new section for test_virtual_keys_access.py and the /v1/models integration description correctly reflect the VK-based model filtering semantics (provider_configs, empty configs, invalid/inactive VK). If you intend to use the new @pytest.mark.access_control marker for these tests, consider tagging the new suite accordingly so the documented marker is actually exercised.

Also applies to: 173-174, 205-212

transports/bifrost-http/handlers/inference.go (1)

28-41: VK-based model filtering logic matches the documented semantics

The governance wiring into CompletionHandler and NewInferenceHandler, plus filterModelsByVirtualKey, correctly implement the described rules:

  • governanceStore == nil, bifrostCtx == nil, missing/empty x-bf-vk, unknown VK, or inactive VK all fall back to returning the original models slice (fail-open behavior).
  • Empty vk.ProviderConfigs (len(...) == 0) treats the VK as unrestricted and returns all models.
  • Provider filtering is done by pc.Provider == string(modelProvider) with modelProvider parsed from model.ID via ParseModelString, so only configured providers are included when ProviderConfigs is non-empty.
  • Model filtering honors allowed_models per provider, accepting either bare model names (modelName) or fully qualified IDs (model.ID), and treating an empty allowed_models slice as “all models for that provider”.

This matches both the tests and the /v1/models docs. The logic is straightforward and avoids nil panics on slices. One optional improvement would be to add a brief comment noting that unprefixed model IDs (no "provider/" prefix) are intentionally excluded whenever ProviderConfigs is non-empty, to make that edge case explicit for future maintainers.

Also applies to: 261-319

docs/features/governance/virtual-keys.mdx (1)

18-25: Docs correctly describe VK-based model filtering (minor JSON nit)

The updated “Access Control” feature and /v1/models section line up with the implementation: header presence, provider_configs scoping, and the empty provider_configs / allowed_models behavior all match the handler logic and tests. As a minor polish, consider moving the explanatory // GET /v1/models ... line outside the ```json block or changing the block type to plain ` ```` so the example isn’t syntactically invalid JSON.

Also applies to: 502-528

tests/governance/test_virtual_keys_access.py (1)

18-307: Test coverage for VK-based /v1/models filtering is solid (consider adding request timeouts)

The scenarios here (specific allowed_models, empty allowed_models, empty provider_configs, nonexistent providers, multiple providers) closely exercise the new filtering logic and match the documented behavior. One practical improvement is to pass a small timeout= to all requests.get(...) calls so CI hangs don’t turn into indefinite waits (also satisfying the S113 lint). For example, you could define a shared constant, e.g. REQUEST_TIMEOUT = 10, and use requests.get(..., timeout=REQUEST_TIMEOUT) across the suite.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ac46d89 and 38b56ab.

📒 Files selected for processing (5)
  • docs/features/governance/virtual-keys.mdx (2 hunks)
  • tests/governance/README.md (3 hunks)
  • tests/governance/test_virtual_keys_access.py (1 hunks)
  • transports/bifrost-http/handlers/inference.go (3 hunks)
  • transports/bifrost-http/server/server.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
transports/bifrost-http/handlers/inference.go (6)
transports/bifrost-http/lib/config.go (2)
  • HandlerStore (35-38)
  • Config (139-169)
plugins/governance/store.go (1)
  • GovernanceStore (18-30)
core/schemas/models.go (1)
  • Model (109-129)
core/schemas/bifrost.go (1)
  • BifrostContextKeyVirtualKey (105-105)
core/schemas/utils.go (1)
  • ParseModelString (21-34)
framework/configstore/clientconfig.go (1)
  • ClientConfig (30-42)
transports/bifrost-http/server/server.go (3)
plugins/governance/store.go (1)
  • GovernanceStore (18-30)
plugins/governance/main.go (1)
  • GovernancePlugin (41-60)
transports/bifrost-http/handlers/inference.go (1)
  • NewInferenceHandler (35-42)
🪛 Ruff (0.14.5)
tests/governance/test_virtual_keys_access.py

28-28: Probable use of requests call without timeout

(S113)


70-70: Probable use of requests call without timeout

(S113)


103-103: Probable use of requests call without timeout

(S113)


138-138: Probable use of requests call without timeout

(S113)


163-163: Probable use of requests call without timeout

(S113)


183-183: Probable use of requests call without timeout

(S113)


221-221: Probable use of requests call without timeout

(S113)


235-235: Probable use of requests call without timeout

(S113)


292-292: Probable use of requests call without timeout

(S113)


320-320: Unused method argument: governance_client

(ARG002)


322-322: Probable use of requests call without timeout

(S113)


330-330: Unused method argument: governance_client

(ARG002)


332-332: Probable use of requests call without timeout

(S113)


337-337: Probable use of requests call without timeout

(S113)


353-353: Probable use of requests call without timeout

(S113)


388-388: Probable use of requests call without timeout

(S113)


409-409: Probable use of requests call without timeout

(S113)


457-457: Probable use of requests call without timeout

(S113)


463-463: Local variable filtered_ids is assigned to but never used

Remove assignment to unused variable filtered_ids

(F841)

🔇 Additional comments (2)
transports/bifrost-http/handlers/inference.go (1)

350-408: Conditional hook into VK filtering is appropriately scoped

The listModels handler now only applies filterModelsByVirtualKey when ClientConfig.EnableGovernance is true, and passes the same bifrostCtx used for the core list call along with the shared governanceStore. Given the early-return guards in filterModelsByVirtualKey, this is safe even if the governance plugin failed to initialize or the VK is missing/invalid, and it keeps behavior fully backward compatible when governance is disabled or the header is absent.

transports/bifrost-http/server/server.go (1)

843-856: GovernanceStore wiring into inference routes is safe and non-intrusive

RegisterInferenceRoutes now looks up the governance plugin, pulls its GovernanceStore if available, and passes it into NewInferenceHandler. Ignoring the error from FindPluginByName and defaulting governanceStore to nil keeps this path safe when the plugin is disabled or failed to load; combined with handler-side guards, inference remains fully functional without governance. This is a clean way to expose governance state to /v1/models without impacting other routes.

@akshaydeo
Copy link
Contributor

Thanks @KUNJ1311 for the contribution ❤️

- Remove unused governance_client fixture from tests that don't need it
- Remove unused filtered_ids variable in test_list_models_with_mixed_provider_configs
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/governance/test_virtual_keys_access.py (3)

24-96: Provider-config filtering test is thorough; helper extraction would be purely cosmetic

The logic for discovering a provider with ≥2 models, creating the VK, and asserting both inclusion and exclusion (plus basic schema fields) is solid and captures the intended behavior well. If this provider/model discovery pattern ends up repeated elsewhere, consider extracting it into a small helper (or using collections.defaultdict(list) for providers_map) for readability, but there’s no need to change anything for correctness here.


28-459: Consider adding explicit timeouts to requests.get calls in tests

All HTTP calls here use requests.get(...) without a timeout, which triggers Ruff S113 and can leave tests hanging indefinitely if the server stops responding. Even for integration tests, adding a small explicit timeout (e.g., via a module-level REQUEST_TIMEOUT = 5 and using requests.get(..., timeout=REQUEST_TIMEOUT), or a helper in conftest) makes the suite more robust and should satisfy the linter.


330-402: Tighten expectations for invalid/inactive VK behavior if “fail-safe 200 + unfiltered” is a hard requirement

These tests currently accept 200, 401, or 403 for invalid and inactive VKs, only asserting “unfiltered models” on the 200 path. The PR description, however, says fail-safe behavior for invalid/inactive VKs is to return unfiltered models, which implies a 200-success path rather than 4xx. If that behavior is indeed contractual, you may want to restrict the assertion to status_code == 200 and always compare returned IDs to the baseline set so the tests will catch any future deviation. If different deployments are allowed to return 4xx here, leaving the current flexible assertions is fine.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 38b56ab and d4e870d.

📒 Files selected for processing (1)
  • tests/governance/test_virtual_keys_access.py (1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.5)
tests/governance/test_virtual_keys_access.py

28-28: Probable use of requests call without timeout

(S113)


70-70: Probable use of requests call without timeout

(S113)


103-103: Probable use of requests call without timeout

(S113)


138-138: Probable use of requests call without timeout

(S113)


163-163: Probable use of requests call without timeout

(S113)


183-183: Probable use of requests call without timeout

(S113)


221-221: Probable use of requests call without timeout

(S113)


235-235: Probable use of requests call without timeout

(S113)


292-292: Probable use of requests call without timeout

(S113)


322-322: Probable use of requests call without timeout

(S113)


332-332: Probable use of requests call without timeout

(S113)


337-337: Probable use of requests call without timeout

(S113)


353-353: Probable use of requests call without timeout

(S113)


388-388: Probable use of requests call without timeout

(S113)


409-409: Probable use of requests call without timeout

(S113)


457-457: Probable use of requests call without timeout

(S113)

🔇 Additional comments (1)
tests/governance/test_virtual_keys_access.py (1)

405-478: Mixed provider-configs test cleanly validates combined “restricted + unrestricted” behavior

This test nicely exercises the case where one provider is constrained to a single model and another is unrestricted, and the assertions on expected_count, per-provider membership, and exact equality for the second provider look correct. The prior unused filtered_ids issue mentioned in earlier reviews is resolved here, and the structure is clear and maintainable.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
core/bifrost.go (1)

300-419: ListModelsFromProviders ignores request.ExtraParams, diverging from single-provider behaviour

The helper looks solid overall (concurrency, error aggregation, and MaxPaginationRequests handling), but two nuances are worth calling out:

  1. ExtraParams are dropped for multi-provider fetches

ListModelsFromProviders currently constructs a fresh BifrostListModelsRequest per provider with only Provider and PageSize: schemas.DefaultPageSize, ignoring request.ExtraParams (and other fields):

providerRequest := &schemas.BifrostListModelsRequest{
    Provider: providerKey,
    PageSize: schemas.DefaultPageSize,
}

In contrast, ListModelsRequest preserves ExtraParams when forwarding to the provider. For /v1/models, that means provider-specific query params are respected when len(providersToFetch) == 1 but silently ignored when multiple providers are queried via ListModelsFromProviders.

If the intent is to allow provider-specific filters even in the multi-provider case (or at least when all providers share semantics), consider threading ExtraParams through:

-           providerRequest := &schemas.BifrostListModelsRequest{
-               Provider: providerKey,
-               PageSize: schemas.DefaultPageSize,
-           }
+           providerRequest := &schemas.BifrostListModelsRequest{
+               Provider:    providerKey,
+               PageSize:    schemas.DefaultPageSize,
+               ExtraParams: request.ExtraParams,
+           }

If multi-provider + ExtraParams is intentionally unsupported, it’s worth documenting that contract on the function to avoid surprises later.

🧹 Nitpick comments (2)
transports/bifrost-http/handlers/inference.go (2)

21-41: Governance wiring and VK provider resolution behaviour

The new governance integration is generally well-shaped:

  • Injecting governanceStore *governance.GovernanceStore into CompletionHandler and wiring it via NewInferenceHandler keeps concerns localized while allowing HTTP-layer filtering.
  • getVirtualKeyProviders correctly implements the intended semantics:
    • Returns nil when governance is disabled, the context is nil, the VK header is absent/empty, or the VK is missing/invalid/inactive.
    • Returns an empty but non-nil map when vk.ProviderConfigs is empty, which listModels then interprets as “no VK restrictions”.
    • Produces map[schemas.ModelProvider][]string for non-empty provider_configs, giving you a clean structure to drive both provider and model-level filtering.

This matches the documented design decisions: backward compatibility when VK is absent, and a fail-open behaviour when governance is disabled or VK is invalid/inactive.

Also applies to: 261-295, 370-374


448-453: Provider-not-found handling and pricing guard

Two smaller notes:

  1. Provider-not-found special casing
if bifrostErr.Error != nil && bifrostErr.Error.Message == "provider not found for list models request" {
    SendJSON(ctx, &schemas.BifrostListModelsResponse{ Data: []schemas.Model{} })
    return
}

Returning an empty data array with 200 for an unknown provider is reasonable, but this couples the HTTP layer to an exact error message string. If you later change the wording in ListModelsRequest, this behaviour silently changes. Long-term, a more robust approach would be to key off a typed error code or ErrorField.Type instead of Message, but that’s non-blocking here.

  1. Skipping pricing overwrite when provider already set Pricing

The new early-continue on modelEntry.Pricing != nil before consulting PricingManager:

for i, modelEntry := range resp.Data {
    if modelEntry.Pricing != nil {
        continue
    }
    ...
    resp.Data[i].Pricing = pricing
}

is a sensible safeguard to avoid clobbering provider-supplied pricing with catalog defaults. This keeps the existing augmentation behaviour for models without Pricing while deferring to the provider when Pricing is present.

Both changes look fine as-is; consider only the error-typing refinement if/when you introduce structured error codes for this path.

Also applies to: 463-469

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d4e870d and 76beab8.

📒 Files selected for processing (2)
  • core/bifrost.go (3 hunks)
  • transports/bifrost-http/handlers/inference.go (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
transports/bifrost-http/handlers/inference.go (6)
transports/bifrost-http/lib/config.go (2)
  • HandlerStore (35-38)
  • Config (139-169)
plugins/governance/store.go (1)
  • GovernanceStore (18-30)
core/schemas/bifrost.go (2)
  • ModelProvider (32-32)
  • BifrostContextKeyVirtualKey (105-105)
core/schemas/models.go (3)
  • Model (109-129)
  • BifrostListModelsResponse (36-45)
  • Pricing (139-148)
core/schemas/utils.go (1)
  • ParseModelString (21-34)
transports/bifrost-http/handlers/utils.go (3)
  • SendError (35-44)
  • SendJSON (16-22)
  • SendBifrostError (47-62)
core/bifrost.go (2)
core/schemas/bifrost.go (3)
  • ModelProvider (32-32)
  • BifrostError (353-362)
  • ErrorField (371-378)
core/schemas/models.go (2)
  • BifrostListModelsRequest (23-34)
  • BifrostListModelsResponse (36-45)
🔇 Additional comments (2)
core/bifrost.go (1)

421-445: ListAllModels delegation + pagination looks consistent

The refactor to delegate to ListModelsFromProviders and then apply response.ApplyPagination(request.PageSize, request.PageToken) keeps the previous contract for ListAllModels while centralizing the concurrent fetch logic. This looks correct and keeps the public API behaviour stable.

transports/bifrost-http/handlers/inference.go (1)

296-325: Review comment is based on incorrect assumptions about model ID format

The flagged concern assumes some providers might return non-namespaced model IDs, but verification shows all providers consistently namespace IDs as provider/model: OpenAI, Bedrock, Azure, Cohere, Anthropic, and Gemini all use this pattern when constructing Model.ID values. Since ParseModelString correctly extracts the provider from prefixed IDs, the filtering logic works as intended without hardening.

Likely an incorrect or invalid review comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
core/bifrost.go (1)

300-420: Nil-dereference risk when filtering provider errors in ListModelsFromProviders

The loop around Line 353 assumes bifrostErr.Error is non-nil:

if bifrostErr != nil {
    if !strings.Contains(bifrostErr.Error.Message, "no keys found") &&
        !strings.Contains(bifrostErr.Error.Message, "not supported") {
        providerErr = bifrostErr
        bifrost.logger.Warn(...)
    }
    break
}

If any provider ever returns a *schemas.BifrostError with a nil Error field, this will panic. The rest of the codebase generally sets Error, but this function should be defensive.

You can guard and still keep the special-casing:

-            if !strings.Contains(bifrostErr.Error.Message, "no keys found") &&
-                !strings.Contains(bifrostErr.Error.Message, "not supported") {
-                providerErr = bifrostErr
-                bifrost.logger.Warn(fmt.Sprintf("failed to list models for provider %s: %s", providerKey, GetErrorMessage(bifrostErr)))
-            }
+            if bifrostErr.Error == nil {
+                // No structured error; treat as generic failure and log once
+                providerErr = bifrostErr
+                bifrost.logger.Warn(fmt.Sprintf("failed to list models for provider %s: %s", providerKey, GetErrorMessage(bifrostErr)))
+            } else if !strings.Contains(bifrostErr.Error.Message, "no keys found") &&
+                !strings.Contains(bifrostErr.Error.Message, "not supported") {
+                providerErr = bifrostErr
+                bifrost.logger.Warn(fmt.Sprintf("failed to list models for provider %s: %s", providerKey, GetErrorMessage(bifrostErr)))
+            }

This preserves the “no keys found” / “not supported” suppression but avoids a crash if Error is unexpectedly nil.

transports/bifrost-http/handlers/inference.go (1)

370-448: Missing test coverage and documentation for "provider not found" behavior change

The semantic change to return HTTP 200 with empty data for unknown providers (instead of error) is confirmed in the code at lines 451–456, but verification found critical gaps:

Test Coverage: No tests exist for this scenario. The existing test in tests/core-providers/scenarios/list_models.go only covers the happy path and would actually fail if an unknown provider returned empty data (it asserts len(response.Data) == 0 as a failure).

Documentation: The OpenAPI spec (docs/apis/openapi.json) documents only HTTP 200, 400, 500 responses but does not explain when 200 with empty data is returned versus when an error occurs. No docs reflect this behavior change.

Required:

  • Add tests explicitly covering both governance-enabled and governance-disabled cases with unknown providers (verify returns 200 empty, not error)
  • Update OpenAPI documentation to clarify the "provider not found" behavior and when it returns 200 vs error
  • If this behavior is intentional, ensure it's reflected consistently across all integration endpoints
🧹 Nitpick comments (3)
core/bifrost.go (1)

300-305: Clarify ListModelsFromProviders pagination semantics in doc comment

The new helper ignores request.PageSize and request.PageToken for upstream calls (it always uses DefaultPageSize and walks pages until exhaustion / MaxPaginationRequests), and then callers like ListAllModels apply global pagination via ApplyPagination:

providerRequest := &schemas.BifrostListModelsRequest{
    Provider:    providerKey,
    PageSize:    schemas.DefaultPageSize,
    ExtraParams: request.ExtraParams,
}
...
response = response.ApplyPagination(request.PageSize, request.PageToken)

This behavior is reasonable for aggregating multiple providers, but it’s non-obvious from the function name/signature that ListModelsFromProviders returns the full aggregated set and that the caller is responsible for applying final pagination.

I’d suggest tightening the comment to avoid misuse, e.g.:

-// ListModelsFromProviders fetches models concurrently from specified providers
+// ListModelsFromProviders fetches and fully paginates models concurrently from the specified providers.
+// It returns the complete aggregated list (sorted by ID); callers are responsible for applying any
+// cross-provider pagination (e.g., via ApplyPagination on the returned response).

This keeps the external behavior consistent with ListAllModels and helps future call sites understand that request.PageSize/PageToken are not honored per-provider here.

Also applies to: 422-447

transports/bifrost-http/handlers/inference.go (2)

261-325: VK helper semantics are sound; be explicit about model ID/provider assumptions

getVirtualKeyProviders and filterModelsByAllowedModels nicely encode the intended governance semantics:

  • Missing / nil governanceStore, missing VK, inactive VK → nil map → no filtering.
  • VK with empty ProviderConfigs → empty but non-nil map → treated as “no restrictions”.
  • For a given provider:
    • Empty AllowedModels slice → all models for that provider allowed.
    • Non-empty slice → matches either allowedModel == modelName or allowedModel == model.ID.

One potential gotcha: filterModelsByAllowedModels infers the provider from model.ID via:

modelProvider, modelName := schemas.ParseModelString(model.ID, "")
...
allowedModels, providerAllowed := allowedProviders[modelProvider]

This assumes model IDs are consistently of the form provider/model. If any provider returns plain IDs without a prefix (e.g. "gpt-4o"), modelProvider will be "", no entry in allowedProviders will match, and those models will always be filtered out for VKs that specify provider restrictions.

If that pattern isn’t globally enforced, consider augmenting the provider detection to fall back to other fields (e.g., TopProvider / OwnedBy) or to using the provider implied by the source of the list, e.g.:

  • Pass the provider alongside each model from core into this filter, or
  • Store provider on the model in a dedicated field and read from there.

That would make VK filtering robust even if some providers don’t encode the provider in Model.ID.


466-487: Respecting existing Pricing on models before augmenting is correct

The new guard:

if modelEntry.Pricing != nil {
    continue
}

ensures you don’t overwrite provider-supplied pricing when attaching catalog-based pricing metadata. The rest of the loop still keys off ParseModelString(modelEntry.ID, "") and PricingManager.GetPricingEntryForModel, so behavior is backward-compatible for models without explicit pricing.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 76beab8 and 4ede283.

📒 Files selected for processing (2)
  • core/bifrost.go (4 hunks)
  • transports/bifrost-http/handlers/inference.go (4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
transports/bifrost-http/handlers/inference.go (8)
transports/bifrost-http/lib/config.go (2)
  • HandlerStore (35-38)
  • Config (139-169)
plugins/governance/store.go (1)
  • GovernanceStore (18-30)
core/schemas/bifrost.go (3)
  • ModelProvider (32-32)
  • BifrostContextKeyVirtualKey (105-105)
  • ListModelsRequest (86-86)
core/schemas/provider.go (1)
  • Provider (207-234)
core/schemas/models.go (3)
  • Model (109-129)
  • BifrostListModelsResponse (36-45)
  • Pricing (139-148)
core/schemas/utils.go (1)
  • ParseModelString (21-34)
framework/configstore/clientconfig.go (1)
  • ClientConfig (30-42)
transports/bifrost-http/handlers/utils.go (3)
  • SendError (35-44)
  • SendJSON (16-22)
  • SendBifrostError (47-62)
core/bifrost.go (3)
core/schemas/bifrost.go (3)
  • ModelProvider (32-32)
  • BifrostError (353-362)
  • ErrorField (371-378)
core/schemas/models.go (3)
  • BifrostListModelsRequest (23-34)
  • BifrostListModelsResponse (36-45)
  • DefaultPageSize (11-11)
core/schemas/provider.go (1)
  • Provider (207-234)
🔇 Additional comments (1)
transports/bifrost-http/handlers/inference.go (1)

21-24: Governance store wiring into CompletionHandler looks correct

Injecting *governance.GovernanceStore into CompletionHandler and threading it via NewInferenceHandler is clean, and downstream VK helpers already nil-check the store, so this is safe when governance is disabled or the plugin is not configured.

Also applies to: 28-32, 35-41

@KUNJ1311
Copy link
Author

@akshaydeo I have two questions:

  1. model pools in memory (ModelCatalog)
  • Do I need to use ModelCatalog for the /v1/models API?
  • If yes, I have some concerns:
    1. GetModelsForProvider() returns duplicate models in the array
    2. Currently only stores provider names + model lists (no metadata like max_tokens, descriptions, etc.)
    3. To use it, we'd need to cache all metadata in memory + fix the duplicates issue first
      Should I pursue this approach or is the current provider-fetching optimization sufficient?
  1. Invalid/Inactive VK behavior
  • Current flow returns all models (fail-safe) for invalid or inactive VKs
  • Should this return 403/401 errors instead?

Let me know your preference and I'll adjust accordingly!

@KUNJ1311 KUNJ1311 requested a review from akshaydeo November 20, 2025 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants