-
Notifications
You must be signed in to change notification settings - Fork 127
feat: filter /v1/models endpoint by virtual key provider_configs #883
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: filter /v1/models endpoint by virtual key provider_configs #883
Conversation
📝 WalkthroughSummary by CodeRabbitRelease Notes
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughInference model-listing now consults an optional GovernanceStore and applies virtual-key (x-bf-vk) provider_configs to filter models returned by GET /v1/models; documentation and tests were added and server wiring updated to pass the GovernanceStore into the inference handler. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Server as HTTP Server
participant Handler as InferenceHandler
participant GovStore as GovernanceStore
participant Bifrost as BifrostClient
Client->>Server: GET /v1/models (may include x-bf-vk)
Server->>Handler: listModels(request)
Handler->>Bifrost: ListModelsFromProviders / ListAllModels
Bifrost-->>Handler: models[]
alt GovernanceStore present AND x-bf-vk header present
Handler->>GovStore: Resolve VK -> provider_configs
GovStore-->>Handler: provider_configs
Handler->>Handler: getVirtualKeyProviders(provider_configs)
Handler->>Handler: filterModelsByAllowedModels(allowedProviders, models)
Handler-->>Server: filtered models
else No governance or no header
Handler-->>Server: models (unfiltered)
end
Server-->>Client: JSON models list
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (4)
tests/governance/README.md (1)
51-58: Docs align with behavior; consider wiring in the new markerThe new section for
test_virtual_keys_access.pyand the/v1/modelsintegration description correctly reflect the VK-based model filtering semantics (provider_configs, empty configs, invalid/inactive VK). If you intend to use the new@pytest.mark.access_controlmarker for these tests, consider tagging the new suite accordingly so the documented marker is actually exercised.Also applies to: 173-174, 205-212
transports/bifrost-http/handlers/inference.go (1)
28-41: VK-based model filtering logic matches the documented semanticsThe governance wiring into
CompletionHandlerandNewInferenceHandler, plusfilterModelsByVirtualKey, correctly implement the described rules:
governanceStore == nil,bifrostCtx == nil, missing/emptyx-bf-vk, unknown VK, or inactive VK all fall back to returning the originalmodelsslice (fail-open behavior).- Empty
vk.ProviderConfigs(len(...) == 0) treats the VK as unrestricted and returns all models.- Provider filtering is done by
pc.Provider == string(modelProvider)withmodelProviderparsed frommodel.IDviaParseModelString, so only configured providers are included whenProviderConfigsis non-empty.- Model filtering honors
allowed_modelsper provider, accepting either bare model names (modelName) or fully qualified IDs (model.ID), and treating an emptyallowed_modelsslice as “all models for that provider”.This matches both the tests and the
/v1/modelsdocs. The logic is straightforward and avoids nil panics on slices. One optional improvement would be to add a brief comment noting that unprefixed model IDs (no"provider/"prefix) are intentionally excluded wheneverProviderConfigsis non-empty, to make that edge case explicit for future maintainers.Also applies to: 261-319
docs/features/governance/virtual-keys.mdx (1)
18-25: Docs correctly describe VK-based model filtering (minor JSON nit)The updated “Access Control” feature and
/v1/modelssection line up with the implementation: header presence, provider_configs scoping, and the emptyprovider_configs/allowed_modelsbehavior all match the handler logic and tests. As a minor polish, consider moving the explanatory// GET /v1/models ...line outside the```jsonblock or changing the block type to plain ` ```` so the example isn’t syntactically invalid JSON.Also applies to: 502-528
tests/governance/test_virtual_keys_access.py (1)
18-307: Test coverage for VK-based/v1/modelsfiltering is solid (consider adding request timeouts)The scenarios here (specific
allowed_models, emptyallowed_models, emptyprovider_configs, nonexistent providers, multiple providers) closely exercise the new filtering logic and match the documented behavior. One practical improvement is to pass a smalltimeout=to allrequests.get(...)calls so CI hangs don’t turn into indefinite waits (also satisfying the S113 lint). For example, you could define a shared constant, e.g.REQUEST_TIMEOUT = 10, and userequests.get(..., timeout=REQUEST_TIMEOUT)across the suite.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
docs/features/governance/virtual-keys.mdx(2 hunks)tests/governance/README.md(3 hunks)tests/governance/test_virtual_keys_access.py(1 hunks)transports/bifrost-http/handlers/inference.go(3 hunks)transports/bifrost-http/server/server.go(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
transports/bifrost-http/handlers/inference.go (6)
transports/bifrost-http/lib/config.go (2)
HandlerStore(35-38)Config(139-169)plugins/governance/store.go (1)
GovernanceStore(18-30)core/schemas/models.go (1)
Model(109-129)core/schemas/bifrost.go (1)
BifrostContextKeyVirtualKey(105-105)core/schemas/utils.go (1)
ParseModelString(21-34)framework/configstore/clientconfig.go (1)
ClientConfig(30-42)
transports/bifrost-http/server/server.go (3)
plugins/governance/store.go (1)
GovernanceStore(18-30)plugins/governance/main.go (1)
GovernancePlugin(41-60)transports/bifrost-http/handlers/inference.go (1)
NewInferenceHandler(35-42)
🪛 Ruff (0.14.5)
tests/governance/test_virtual_keys_access.py
28-28: Probable use of requests call without timeout
(S113)
70-70: Probable use of requests call without timeout
(S113)
103-103: Probable use of requests call without timeout
(S113)
138-138: Probable use of requests call without timeout
(S113)
163-163: Probable use of requests call without timeout
(S113)
183-183: Probable use of requests call without timeout
(S113)
221-221: Probable use of requests call without timeout
(S113)
235-235: Probable use of requests call without timeout
(S113)
292-292: Probable use of requests call without timeout
(S113)
320-320: Unused method argument: governance_client
(ARG002)
322-322: Probable use of requests call without timeout
(S113)
330-330: Unused method argument: governance_client
(ARG002)
332-332: Probable use of requests call without timeout
(S113)
337-337: Probable use of requests call without timeout
(S113)
353-353: Probable use of requests call without timeout
(S113)
388-388: Probable use of requests call without timeout
(S113)
409-409: Probable use of requests call without timeout
(S113)
457-457: Probable use of requests call without timeout
(S113)
463-463: Local variable filtered_ids is assigned to but never used
Remove assignment to unused variable filtered_ids
(F841)
🔇 Additional comments (2)
transports/bifrost-http/handlers/inference.go (1)
350-408: Conditional hook into VK filtering is appropriately scopedThe
listModelshandler now only appliesfilterModelsByVirtualKeywhenClientConfig.EnableGovernanceis true, and passes the samebifrostCtxused for the core list call along with the sharedgovernanceStore. Given the early-return guards infilterModelsByVirtualKey, this is safe even if the governance plugin failed to initialize or the VK is missing/invalid, and it keeps behavior fully backward compatible when governance is disabled or the header is absent.transports/bifrost-http/server/server.go (1)
843-856: GovernanceStore wiring into inference routes is safe and non-intrusive
RegisterInferenceRoutesnow looks up the governance plugin, pulls itsGovernanceStoreif available, and passes it intoNewInferenceHandler. Ignoring the error fromFindPluginByNameand defaultinggovernanceStoretonilkeeps this path safe when the plugin is disabled or failed to load; combined with handler-side guards, inference remains fully functional without governance. This is a clean way to expose governance state to/v1/modelswithout impacting other routes.
|
Thanks @KUNJ1311 for the contribution ❤️ |
- Remove unused governance_client fixture from tests that don't need it - Remove unused filtered_ids variable in test_list_models_with_mixed_provider_configs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
tests/governance/test_virtual_keys_access.py (3)
24-96: Provider-config filtering test is thorough; helper extraction would be purely cosmeticThe logic for discovering a provider with ≥2 models, creating the VK, and asserting both inclusion and exclusion (plus basic schema fields) is solid and captures the intended behavior well. If this provider/model discovery pattern ends up repeated elsewhere, consider extracting it into a small helper (or using
collections.defaultdict(list)forproviders_map) for readability, but there’s no need to change anything for correctness here.
28-459: Consider adding explicit timeouts torequests.getcalls in testsAll HTTP calls here use
requests.get(...)without atimeout, which triggers Ruff S113 and can leave tests hanging indefinitely if the server stops responding. Even for integration tests, adding a small explicit timeout (e.g., via a module-levelREQUEST_TIMEOUT = 5and usingrequests.get(..., timeout=REQUEST_TIMEOUT), or a helper inconftest) makes the suite more robust and should satisfy the linter.
330-402: Tighten expectations for invalid/inactive VK behavior if “fail-safe 200 + unfiltered” is a hard requirementThese tests currently accept
200,401, or403for invalid and inactive VKs, only asserting “unfiltered models” on the 200 path. The PR description, however, says fail-safe behavior for invalid/inactive VKs is to return unfiltered models, which implies a 200-success path rather than 4xx. If that behavior is indeed contractual, you may want to restrict the assertion tostatus_code == 200and always compare returned IDs to the baseline set so the tests will catch any future deviation. If different deployments are allowed to return 4xx here, leaving the current flexible assertions is fine.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/governance/test_virtual_keys_access.py(1 hunks)
🧰 Additional context used
🪛 Ruff (0.14.5)
tests/governance/test_virtual_keys_access.py
28-28: Probable use of requests call without timeout
(S113)
70-70: Probable use of requests call without timeout
(S113)
103-103: Probable use of requests call without timeout
(S113)
138-138: Probable use of requests call without timeout
(S113)
163-163: Probable use of requests call without timeout
(S113)
183-183: Probable use of requests call without timeout
(S113)
221-221: Probable use of requests call without timeout
(S113)
235-235: Probable use of requests call without timeout
(S113)
292-292: Probable use of requests call without timeout
(S113)
322-322: Probable use of requests call without timeout
(S113)
332-332: Probable use of requests call without timeout
(S113)
337-337: Probable use of requests call without timeout
(S113)
353-353: Probable use of requests call without timeout
(S113)
388-388: Probable use of requests call without timeout
(S113)
409-409: Probable use of requests call without timeout
(S113)
457-457: Probable use of requests call without timeout
(S113)
🔇 Additional comments (1)
tests/governance/test_virtual_keys_access.py (1)
405-478: Mixed provider-configs test cleanly validates combined “restricted + unrestricted” behaviorThis test nicely exercises the case where one provider is constrained to a single model and another is unrestricted, and the assertions on
expected_count, per-provider membership, and exact equality for the second provider look correct. The prior unusedfiltered_idsissue mentioned in earlier reviews is resolved here, and the structure is clear and maintainable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
core/bifrost.go (1)
300-419: ListModelsFromProviders ignores request.ExtraParams, diverging from single-provider behaviourThe helper looks solid overall (concurrency, error aggregation, and MaxPaginationRequests handling), but two nuances are worth calling out:
- ExtraParams are dropped for multi-provider fetches
ListModelsFromProviderscurrently constructs a freshBifrostListModelsRequestper provider with onlyProviderandPageSize: schemas.DefaultPageSize, ignoringrequest.ExtraParams(and other fields):providerRequest := &schemas.BifrostListModelsRequest{ Provider: providerKey, PageSize: schemas.DefaultPageSize, }In contrast,
ListModelsRequestpreservesExtraParamswhen forwarding to the provider. For/v1/models, that means provider-specific query params are respected whenlen(providersToFetch) == 1but silently ignored when multiple providers are queried viaListModelsFromProviders.If the intent is to allow provider-specific filters even in the multi-provider case (or at least when all providers share semantics), consider threading ExtraParams through:
- providerRequest := &schemas.BifrostListModelsRequest{ - Provider: providerKey, - PageSize: schemas.DefaultPageSize, - } + providerRequest := &schemas.BifrostListModelsRequest{ + Provider: providerKey, + PageSize: schemas.DefaultPageSize, + ExtraParams: request.ExtraParams, + }If multi-provider + ExtraParams is intentionally unsupported, it’s worth documenting that contract on the function to avoid surprises later.
🧹 Nitpick comments (2)
transports/bifrost-http/handlers/inference.go (2)
21-41: Governance wiring and VK provider resolution behaviourThe new governance integration is generally well-shaped:
- Injecting
governanceStore *governance.GovernanceStoreintoCompletionHandlerand wiring it viaNewInferenceHandlerkeeps concerns localized while allowing HTTP-layer filtering.getVirtualKeyProviderscorrectly implements the intended semantics:
- Returns
nilwhen governance is disabled, the context is nil, the VK header is absent/empty, or the VK is missing/invalid/inactive.- Returns an empty but non-nil map when
vk.ProviderConfigsis empty, whichlistModelsthen interprets as “no VK restrictions”.- Produces
map[schemas.ModelProvider][]stringfor non-empty provider_configs, giving you a clean structure to drive both provider and model-level filtering.This matches the documented design decisions: backward compatibility when VK is absent, and a fail-open behaviour when governance is disabled or VK is invalid/inactive.
Also applies to: 261-295, 370-374
448-453: Provider-not-found handling and pricing guardTwo smaller notes:
- Provider-not-found special casing
if bifrostErr.Error != nil && bifrostErr.Error.Message == "provider not found for list models request" { SendJSON(ctx, &schemas.BifrostListModelsResponse{ Data: []schemas.Model{} }) return }Returning an empty
dataarray with 200 for an unknown provider is reasonable, but this couples the HTTP layer to an exact error message string. If you later change the wording inListModelsRequest, this behaviour silently changes. Long-term, a more robust approach would be to key off a typed error code orErrorField.Typeinstead ofMessage, but that’s non-blocking here.
- Skipping pricing overwrite when provider already set Pricing
The new early-continue on
modelEntry.Pricing != nilbefore consultingPricingManager:for i, modelEntry := range resp.Data { if modelEntry.Pricing != nil { continue } ... resp.Data[i].Pricing = pricing }is a sensible safeguard to avoid clobbering provider-supplied pricing with catalog defaults. This keeps the existing augmentation behaviour for models without Pricing while deferring to the provider when Pricing is present.
Both changes look fine as-is; consider only the error-typing refinement if/when you introduce structured error codes for this path.
Also applies to: 463-469
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/bifrost.go(3 hunks)transports/bifrost-http/handlers/inference.go(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
transports/bifrost-http/handlers/inference.go (6)
transports/bifrost-http/lib/config.go (2)
HandlerStore(35-38)Config(139-169)plugins/governance/store.go (1)
GovernanceStore(18-30)core/schemas/bifrost.go (2)
ModelProvider(32-32)BifrostContextKeyVirtualKey(105-105)core/schemas/models.go (3)
Model(109-129)BifrostListModelsResponse(36-45)Pricing(139-148)core/schemas/utils.go (1)
ParseModelString(21-34)transports/bifrost-http/handlers/utils.go (3)
SendError(35-44)SendJSON(16-22)SendBifrostError(47-62)
core/bifrost.go (2)
core/schemas/bifrost.go (3)
ModelProvider(32-32)BifrostError(353-362)ErrorField(371-378)core/schemas/models.go (2)
BifrostListModelsRequest(23-34)BifrostListModelsResponse(36-45)
🔇 Additional comments (2)
core/bifrost.go (1)
421-445: ListAllModels delegation + pagination looks consistentThe refactor to delegate to
ListModelsFromProvidersand then applyresponse.ApplyPagination(request.PageSize, request.PageToken)keeps the previous contract forListAllModelswhile centralizing the concurrent fetch logic. This looks correct and keeps the public API behaviour stable.transports/bifrost-http/handlers/inference.go (1)
296-325: Review comment is based on incorrect assumptions about model ID formatThe flagged concern assumes some providers might return non-namespaced model IDs, but verification shows all providers consistently namespace IDs as
provider/model: OpenAI, Bedrock, Azure, Cohere, Anthropic, and Gemini all use this pattern when constructingModel.IDvalues. SinceParseModelStringcorrectly extracts the provider from prefixed IDs, the filtering logic works as intended without hardening.Likely an incorrect or invalid review comment.
…n in listModels handler
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
core/bifrost.go (1)
300-420: Nil-dereference risk when filtering provider errors inListModelsFromProvidersThe loop around Line 353 assumes
bifrostErr.Erroris non-nil:if bifrostErr != nil { if !strings.Contains(bifrostErr.Error.Message, "no keys found") && !strings.Contains(bifrostErr.Error.Message, "not supported") { providerErr = bifrostErr bifrost.logger.Warn(...) } break }If any provider ever returns a
*schemas.BifrostErrorwith a nilErrorfield, this will panic. The rest of the codebase generally setsError, but this function should be defensive.You can guard and still keep the special-casing:
- if !strings.Contains(bifrostErr.Error.Message, "no keys found") && - !strings.Contains(bifrostErr.Error.Message, "not supported") { - providerErr = bifrostErr - bifrost.logger.Warn(fmt.Sprintf("failed to list models for provider %s: %s", providerKey, GetErrorMessage(bifrostErr))) - } + if bifrostErr.Error == nil { + // No structured error; treat as generic failure and log once + providerErr = bifrostErr + bifrost.logger.Warn(fmt.Sprintf("failed to list models for provider %s: %s", providerKey, GetErrorMessage(bifrostErr))) + } else if !strings.Contains(bifrostErr.Error.Message, "no keys found") && + !strings.Contains(bifrostErr.Error.Message, "not supported") { + providerErr = bifrostErr + bifrost.logger.Warn(fmt.Sprintf("failed to list models for provider %s: %s", providerKey, GetErrorMessage(bifrostErr))) + }This preserves the “no keys found” / “not supported” suppression but avoids a crash if
Erroris unexpectedly nil.transports/bifrost-http/handlers/inference.go (1)
370-448: Missing test coverage and documentation for "provider not found" behavior changeThe semantic change to return HTTP 200 with empty data for unknown providers (instead of error) is confirmed in the code at lines 451–456, but verification found critical gaps:
Test Coverage: No tests exist for this scenario. The existing test in
tests/core-providers/scenarios/list_models.goonly covers the happy path and would actually fail if an unknown provider returned empty data (it assertslen(response.Data) == 0as a failure).Documentation: The OpenAPI spec (
docs/apis/openapi.json) documents only HTTP 200, 400, 500 responses but does not explain when 200 with empty data is returned versus when an error occurs. No docs reflect this behavior change.Required:
- Add tests explicitly covering both governance-enabled and governance-disabled cases with unknown providers (verify returns 200 empty, not error)
- Update OpenAPI documentation to clarify the "provider not found" behavior and when it returns 200 vs error
- If this behavior is intentional, ensure it's reflected consistently across all integration endpoints
🧹 Nitpick comments (3)
core/bifrost.go (1)
300-305: ClarifyListModelsFromProviderspagination semantics in doc commentThe new helper ignores
request.PageSizeandrequest.PageTokenfor upstream calls (it always usesDefaultPageSizeand walks pages until exhaustion /MaxPaginationRequests), and then callers likeListAllModelsapply global pagination viaApplyPagination:providerRequest := &schemas.BifrostListModelsRequest{ Provider: providerKey, PageSize: schemas.DefaultPageSize, ExtraParams: request.ExtraParams, } ... response = response.ApplyPagination(request.PageSize, request.PageToken)This behavior is reasonable for aggregating multiple providers, but it’s non-obvious from the function name/signature that
ListModelsFromProvidersreturns the full aggregated set and that the caller is responsible for applying final pagination.I’d suggest tightening the comment to avoid misuse, e.g.:
-// ListModelsFromProviders fetches models concurrently from specified providers +// ListModelsFromProviders fetches and fully paginates models concurrently from the specified providers. +// It returns the complete aggregated list (sorted by ID); callers are responsible for applying any +// cross-provider pagination (e.g., via ApplyPagination on the returned response).This keeps the external behavior consistent with
ListAllModelsand helps future call sites understand that request.PageSize/PageToken are not honored per-provider here.Also applies to: 422-447
transports/bifrost-http/handlers/inference.go (2)
261-325: VK helper semantics are sound; be explicit about model ID/provider assumptions
getVirtualKeyProvidersandfilterModelsByAllowedModelsnicely encode the intended governance semantics:
- Missing / nil governanceStore, missing VK, inactive VK →
nilmap → no filtering.- VK with empty
ProviderConfigs→ empty but non-nil map → treated as “no restrictions”.- For a given provider:
- Empty
AllowedModelsslice → all models for that provider allowed.- Non-empty slice → matches either
allowedModel == modelNameorallowedModel == model.ID.One potential gotcha:
filterModelsByAllowedModelsinfers the provider frommodel.IDvia:modelProvider, modelName := schemas.ParseModelString(model.ID, "") ... allowedModels, providerAllowed := allowedProviders[modelProvider]This assumes model IDs are consistently of the form
provider/model. If any provider returns plain IDs without a prefix (e.g."gpt-4o"),modelProviderwill be"", no entry inallowedProviderswill match, and those models will always be filtered out for VKs that specify provider restrictions.If that pattern isn’t globally enforced, consider augmenting the provider detection to fall back to other fields (e.g.,
TopProvider/OwnedBy) or to using the provider implied by the source of the list, e.g.:
- Pass the provider alongside each model from core into this filter, or
- Store provider on the model in a dedicated field and read from there.
That would make VK filtering robust even if some providers don’t encode the provider in
Model.ID.
466-487: Respecting existingPricingon models before augmenting is correctThe new guard:
if modelEntry.Pricing != nil { continue }ensures you don’t overwrite provider-supplied pricing when attaching catalog-based pricing metadata. The rest of the loop still keys off
ParseModelString(modelEntry.ID, "")andPricingManager.GetPricingEntryForModel, so behavior is backward-compatible for models without explicit pricing.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/bifrost.go(4 hunks)transports/bifrost-http/handlers/inference.go(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
transports/bifrost-http/handlers/inference.go (8)
transports/bifrost-http/lib/config.go (2)
HandlerStore(35-38)Config(139-169)plugins/governance/store.go (1)
GovernanceStore(18-30)core/schemas/bifrost.go (3)
ModelProvider(32-32)BifrostContextKeyVirtualKey(105-105)ListModelsRequest(86-86)core/schemas/provider.go (1)
Provider(207-234)core/schemas/models.go (3)
Model(109-129)BifrostListModelsResponse(36-45)Pricing(139-148)core/schemas/utils.go (1)
ParseModelString(21-34)framework/configstore/clientconfig.go (1)
ClientConfig(30-42)transports/bifrost-http/handlers/utils.go (3)
SendError(35-44)SendJSON(16-22)SendBifrostError(47-62)
core/bifrost.go (3)
core/schemas/bifrost.go (3)
ModelProvider(32-32)BifrostError(353-362)ErrorField(371-378)core/schemas/models.go (3)
BifrostListModelsRequest(23-34)BifrostListModelsResponse(36-45)DefaultPageSize(11-11)core/schemas/provider.go (1)
Provider(207-234)
🔇 Additional comments (1)
transports/bifrost-http/handlers/inference.go (1)
21-24: Governance store wiring intoCompletionHandlerlooks correctInjecting
*governance.GovernanceStoreintoCompletionHandlerand threading it viaNewInferenceHandleris clean, and downstream VK helpers already nil-check the store, so this is safe when governance is disabled or the plugin is not configured.Also applies to: 28-32, 35-41
|
@akshaydeo I have two questions:
Let me know your preference and I'll adjust accordingly! |
Summary
Implements model filtering for the
/v1/modelsendpoint based on virtual key'sprovider_configs. This allows controlling which models are visible to specific virtual keys by specifying allowed models, providing fine-grained access control at the API level.Changes
filterModelsByVirtualKey()function intransports/bifrost-http/handlers/inference.goto filter models based on virtual key's provider_configslistModelshandler for/v1/modelsendpoint (applied only when governance is enabled)NewInferenceHandler()to accept governance store parameter and addedgovernanceStorefield toCompletionHandlerstructserver/server.go) to inject governance store into inference handlertest_virtual_keys_access.py) with 9 test cases covering all filtering scenariostests/governance/README.mdanddocs/features/governance/virtual-keys.mdxwith detailed/v1/modelsfiltering behaviorDesign decisions:
EnableGovernanceconfig is trueprovider_configs= unrestricted access, emptyallowed_models= all models from providerType of change
Affected areas
How to test
1. Start Bifrost server
2. Run the test suite
cd tests/governance pytest test_virtual_keys_access.py -vExpected outcome: All 9 tests pass
test_list_models_with_vk_filters_by_provider_config- Filters models by VK with specific allowed_modelstest_list_models_with_empty_allowed_models_returns_all_provider_models- Empty allowed_models returns all models from that providertest_list_models_without_provider_configs_returns_all- Empty provider_configs returns all models (unrestricted)test_list_models_with_nonexistent_provider_returns_empty- Non-existent providers return empty listtest_list_models_with_multiple_providers- Filters models from multiple providers in provider_configstest_list_models_without_vk_header_returns_all- Without VK header returns all models (backward compatible)test_list_models_with_invalid_vk_header- Invalid VK header returns all models without filteringtest_list_models_with_inactive_vk- Inactive VK returns all models without filteringtest_list_models_with_mixed_provider_configs- One provider with specific models, another with all models4. Manual testing
Create a virtual key with allowed models:
List models:
Expected behavior:
provider_configsis empty → all modelsallowed_modelsis empty for a provider → all models from that providerallowed_modelshas specific models → only those modelsScreenshots/Recordings
N/A - Backend API feature
Breaking changes
This is a backward-compatible addition. Existing behavior (returning all models without VK header) is preserved.
Related issues
[Feature]: filter models in /v1/models based on virtual key #808
Security considerations
EnableGovernanceis true in configChecklist
docs/contributing/README.mdand followed the guidelines