feat: add LiteLLM as unified AI gateway provider #1521
Open

RheagalFire wants to merge 3 commits into open-compass:main
Conversation
…lm dep

- Move litellm_api import to correct alphabetical position in __init__.py
- Skip empty text values in _prepare_content when images are present (matches Bedrock/GPT pattern)
- Place litellm_kwargs spread before explicit params so runtime overrides take precedence
- Add litellm>=1.55,<1.85 to requirements.txt
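The empty-text-skipping rule described above can be sketched as a pure function. This is an illustrative stand-in for the provider's _prepare_content method, not the actual implementation; the dict schema follows VLMEvalKit's {'type', 'value'} convention mentioned elsewhere in this PR.

```python
def prepare_content(inputs):
    """Convert VLMEvalKit-style {'type', 'value'} dicts into an
    OpenAI-style content list, skipping empty text when images exist."""
    has_image = any(item["type"] == "image" for item in inputs)
    content = []
    for item in inputs:
        if item["type"] == "text":
            # Skip empty text values when images are present
            # (the Bedrock/GPT pattern the commit notes refer to).
            if has_image and not item["value"].strip():
                continue
            content.append({"type": "text", "text": item["value"]})
        elif item["type"] == "image":
            # The real provider would encode the image file here;
            # this sketch passes the value through unchanged.
            content.append({"type": "image_url",
                            "image_url": {"url": item["value"]}})
    return content
```

Note that a text-only input with an empty string is kept, so degenerate text prompts still produce a valid message.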
Author

cc @kennymckormick would like your review
Add 4 more config entries to showcase LiteLLM's cross-provider vision support: Gemini 2.5 Pro, Bedrock Claude 3.5 Sonnet, Llama 3.2 Vision (Together AI), and Llama 4 Scout (Groq).
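The four entries above might be registered in the style VLMEvalKit's config.py uses for API models (a dict of functools.partial factories). Everything below is illustrative: the preset names, the model ID strings, and the stand-in class are assumptions, not the PR's actual code.

```python
from functools import partial

class LiteLLMAPI:  # stand-in for vlmeval.api.LiteLLMAPI in this sketch
    def __init__(self, model, **kwargs):
        self.model = model
        self.kwargs = kwargs

# Hypothetical preset entries in the style of vlmeval/config.py.
litellm_series = {
    "litellm_gemini_2_5_pro": partial(
        LiteLLMAPI, model="gemini/gemini-2.5-pro"),
    "litellm_bedrock_claude_3_5_sonnet": partial(
        LiteLLMAPI,
        model="bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"),
    "litellm_llama_3_2_vision": partial(
        LiteLLMAPI,
        model="together_ai/meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo"),
    "litellm_llama_4_scout": partial(
        LiteLLMAPI,
        model="groq/meta-llama/llama-4-scout-17b-16e-instruct"),
}
```

The partial pattern lets the evaluation harness instantiate a preset lazily, overriding generation parameters at call time.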
Summary
Adds LiteLLMAPI, a new API provider backed by LiteLLM, enabling access to 100+ LLM providers (OpenAI, Anthropic, Azure, Bedrock, Vertex AI, Together, Groq, etc.) through a single unified interface. It follows the same pattern as the existing TogetherAPI and BedrockAPI. Additive only; existing providers untouched.

Motivation
VLMEvalKit currently requires a separate provider file for each LLM backend. Users who want to evaluate models across Azure, Bedrock, or Vertex AI need provider-specific code. LiteLLM provides a unified
completion() interface that handles auth, formatting, and provider-specific quirks, enabling cross-provider evaluation with a single configuration change.

Changes
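For context, LiteLLM's unified interface means swapping the model string is the only per-provider change. A minimal sketch (the model IDs in the comment are examples of LiteLLM's provider-prefixed naming, not values taken from this PR):

```python
# Minimal cross-provider sketch of LiteLLM's unified interface.
try:
    import litellm
except ImportError:
    litellm = None  # optional dependency; the provider lazy-imports it

def ask(model_id, prompt):
    """One call shape for any backend LiteLLM supports, e.g.
    "gpt-4o-mini", "anthropic/claude-3-5-sonnet-20240620",
    or "bedrock/anthropic.claude-3-5-sonnet-20240620-v1:0"."""
    resp = litellm.completion(
        model=model_id,
        messages=[{"role": "user", "content": prompt}],
    )
    # LiteLLM returns an OpenAI-compatible response object.
    return resp.choices[0].message.content
```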
- vlmeval/api/litellm_api.py -- new LiteLLMAPI provider extending BaseAPI
- vlmeval/api/__init__.py -- import + __all__ registration
- vlmeval/config.py -- 8 model presets spanning OpenAI, Anthropic, Google, AWS Bedrock, Together AI, and Groq
- requirements.txt -- added litellm>=1.55,<1.85
- tests/test_litellm_api.py -- 22 unit tests covering init, content prep, message prep, generate_inner, error handling, and registration

Key implementation details
- Dependency litellm>=1.55,<1.85 is lazy-imported at module level with try/except (same pattern as bedrock.py with boto3), so the base install is unaffected.
- drop_params=True by default silently drops provider-unsupported kwargs (e.g. seed/strict on Anthropic, response_format on Bedrock), preventing cross-provider evaluation failures.
- API key resolution order: key= param, then the LITELLM_API_KEY env var, then provider-specific env vars (OPENAI_API_KEY, ANTHROPIC_API_KEY, AZURE_API_KEY, etc.).
- litellm_kwargs passthrough for advanced settings (seed, top_p, provider-specific params).

Usage and Testing
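The override ordering noted in the commit list (litellm_kwargs spread before explicit params, so runtime values win) reduces to plain dict construction. A simplified stand-in for the kwargs assembly inside generate_inner; the function name is hypothetical:

```python
def build_completion_kwargs(model, messages, temperature, max_tokens,
                            litellm_kwargs=None):
    """Assemble the kwargs dict passed to litellm.completion().

    litellm_kwargs is spread FIRST, so the explicit runtime params
    below it take precedence over preset defaults.
    """
    return {
        **(litellm_kwargs or {}),   # preset/advanced settings first
        "model": model,
        "messages": messages,
        "temperature": temperature,  # runtime value overrides preset
        "max_tokens": max_tokens,
        "drop_params": True,         # drop provider-unsupported kwargs
    }
```

With this ordering, a preset that pins temperature=0.9 is still overridden by the temperature the caller passes at generation time, while unrelated extras such as seed pass through untouched.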
1. Unit tests (22/22 pass):
tests/test_litellm_api.py::TestLiteLLMAPIInit::test_default_params PASSED
tests/test_litellm_api.py::TestLiteLLMAPIInit::test_custom_params PASSED
tests/test_litellm_api.py::TestLiteLLMAPIInit::test_key_from_env PASSED
tests/test_litellm_api.py::TestLiteLLMAPIInit::test_key_param_overrides_env PASSED
tests/test_litellm_api.py::TestPrepareContent::test_text_only PASSED
tests/test_litellm_api.py::TestPrepareContent::test_image_and_text PASSED
tests/test_litellm_api.py::TestPrepareMessages::test_flat_inputs PASSED
tests/test_litellm_api.py::TestPrepareMessages::test_system_prompt PASSED
tests/test_litellm_api.py::TestPrepareMessages::test_role_based_inputs PASSED
tests/test_litellm_api.py::TestGenerateInner::test_success PASSED
tests/test_litellm_api.py::TestGenerateInner::test_drop_params_default_true PASSED
tests/test_litellm_api.py::TestGenerateInner::test_api_key_forwarded PASSED
tests/test_litellm_api.py::TestGenerateInner::test_api_key_omitted_when_none PASSED
tests/test_litellm_api.py::TestGenerateInner::test_api_base_forwarded PASSED
tests/test_litellm_api.py::TestGenerateInner::test_error_returns_negative_one PASSED
tests/test_litellm_api.py::TestGenerateInner::test_litellm_not_installed PASSED
tests/test_litellm_api.py::TestGenerateInner::test_temperature_override PASSED
tests/test_litellm_api.py::TestGenerateInner::test_max_tokens_override PASSED
tests/test_litellm_api.py::TestGenerateInner::test_litellm_kwargs_passthrough PASSED
tests/test_litellm_api.py::TestConfigRegistration::test_litellm_entries_in_config PASSED
tests/test_litellm_api.py::TestConfigRegistration::test_litellm_in_init_all PASSED
tests/test_litellm_api.py::TestConfigRegistration::test_version_pin_in_docstring PASSED
======================== 22 passed in 1.29s =========================
2. Live E2E against Azure GPT-4o-mini (text):
LIVE E2E TEST 1: Azure GPT-4o-mini (text)
Model: azure/gpt-4o-mini
Query: What is 2+2? Reply with just the number.
ret_code: 0
answer: 4
model: gpt-4o-mini-2024-07-18
usage: prompt=20, completion=2, total=22
LIVE E2E TEST 2: Azure GPT-4o-mini (multi-turn with system prompt)
Model: azure/gpt-4o-mini
System: You are a math tutor. Be concise.
Query: What is the square root of 144?
ret_code: 0
answer: The square root of 144 is 12.
model: gpt-4o-mini-2024-07-18
usage: prompt=29, completion=11, total=40
3. Live E2E vision test (Anthropic Claude Sonnet 4-6 via Azure):
[Test 1] Text-only completion
Model: anthropic/claude-sonnet-4-6
Prompt: 'What is 2+2? Reply with just the number.'
Answer: 4
Tokens: in=20 out=5
PASSED
[Test 2] Vision -- real image via _prepare_content pipeline
Model: anthropic/claude-sonnet-4-6
Image: BA50EF10-8F5D-4719-BA18-B20A80EF5A8F.png
Answer: I see a cute cartoon character (a round white figure) holding a
glass, sitting next to two bottles of Jack Daniel's whiskey.
Tokens: in=38 out=40
PASSED
[Test 3] VLMEvalKit input format -- dict list with type/value
Model: anthropic/claude-sonnet-4-6
Input: [{'type': 'image', 'value': ''}, {'type': 'text', 'value': '...'}]
Answer: cartoon character, glass, Jack Daniel's whiskey bottle (x2),
wooden box/crate, liquid
Tokens: in=38 out=63
PASSED
================================================================
All 3 tests passed -- text + vision + VLMEvalKit format
4. Lint:
flake8 --max-line-length 99 -> all clean.

Example usage
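The example-usage section was truncated in this capture. A minimal invocation sketch based on the description above; the constructor parameters and generate() call shape are assumptions, not confirmed API:

```python
import os

# Keys resolve from key=, LITELLM_API_KEY, or provider env vars.
os.environ.setdefault("OPENAI_API_KEY", "sk-...")  # placeholder

try:
    from vlmeval.api import LiteLLMAPI
except ImportError:
    LiteLLMAPI = None  # VLMEvalKit not installed in this environment

if LiteLLMAPI is not None:
    # Hypothetical invocation; parameter names assumed from the PR text.
    model = LiteLLMAPI(model="azure/gpt-4o-mini",
                       temperature=0, max_tokens=512)
    print(model.generate("What is 2+2? Reply with just the number."))
```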