
[Bug]: Gemini 2.5 Flash multi-turn function calling fails with "Please ensure that the number of function response parts is equal to the number of function call parts" #17949

@igor-thunk-ai

Description


What happened?

Environment:

  • LiteLLM Version: v1.80.5-stable
  • Model: gemini/gemini-2.5-flash
  • Provider: Google Gemini API
  • OpenAI SDK compatibility mode

Description:

When using Gemini 2.5 Flash with multi-turn function calling (2+ rounds of tool calls), LiteLLM's conversion from OpenAI format to Gemini native format causes validation errors:

Please ensure that the number of function response parts is equal to the number of function call parts

Suspected root cause:

Gemini 2.5 Flash returns thought signatures in assistant responses, which LiteLLM properly captures in two places:

  1. The tool call ID, as call_xxx__thought__
  2. provider_specific_fields.thought_signature on each tool call
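
For example, a round-1 assistant message comes back in OpenAI format roughly like this (the get_weather tool and the concrete ID/signature values are hypothetical illustrations; the two capture points are the ones listed above):

{
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_abc123__thought__<signature>",  # signature embedded in the ID
        "type": "function",
        "function": {"name": "get_weather", "arguments": "{\"city\": \"Paris\"}"},
        "provider_specific_fields": {"thought_signature": "<signature>"},
    }],
}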

However, when these messages are sent back in subsequent turns, LiteLLM re-extracts thought signatures from the embedded tool_call IDs (via _get_thought_signature_from_tool()) and adds them as separate parts in the Gemini native format:

From LiteLLM source: litellm/llms/gemini/chat/transformation.py

parts.append({"thoughtSignature": thought_signature})
parts.append({"functionCall": {...}})

This creates 2 parts for each tool call in assistant messages, but only 1 part for each tool response, causing the validation error.
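
Schematically, the converted history ends up with parts like this (part shapes follow the snippet above and the Gemini generateContent format; the tool name and values are hypothetical):

# Assistant ("model") turn after conversion: two parts per tool call
parts = [
    {"thoughtSignature": "<signature>"},
    {"functionCall": {"name": "get_weather", "args": {"city": "Paris"}}},
]

# Tool-response turn after conversion: one part per tool response
parts = [
    {"functionResponse": {"name": "get_weather", "response": {"temp_c": 21}}},
]

# Per the error above, Gemini rejects this because the function call parts and
# function response parts no longer match one-to-one.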

Steps to Reproduce:

  1. Use Gemini 2.5 Flash with function calling enabled
  2. Make an initial tool call (works fine)
  3. Return tool response
  4. LLM makes another tool call in the same conversation
  5. Error occurs: validation fails due to mismatched parts count
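
A minimal repro sketch of those steps against the Python SDK (the get_weather tool is hypothetical; any prompt that makes the model call a tool more than once in one conversation should do):

import json
import litellm

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, only needed to trigger tool calls
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "Compare the current weather in Paris and Tokyo."}]

# Round 1: the model issues a tool call (this request succeeds)
resp = litellm.completion(model="gemini/gemini-2.5-flash", messages=messages, tools=tools)
assistant = resp.choices[0].message
messages.append(assistant)

# Return a tool response for each tool call
for call in assistant.tool_calls or []:
    messages.append({
        "role": "tool",
        "tool_call_id": call.id,  # carries the embedded __thought__ signature
        "content": json.dumps({"temp_c": 21}),
    })

# Round 2: replaying the history fails with
# "Please ensure that the number of function response parts is equal to the
#  number of function call parts"
resp = litellm.completion(model="gemini/gemini-2.5-flash", messages=messages, tools=tools)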

Expected Behavior:

Multi-turn function calling should not trigger a validation error from Gemini when the payload passed to LiteLLM is valid.

Actual Behavior:

Assistant messages get 2 parts per tool call (thought signature + function call), but tool response messages get 1 part (function response only), causing validation errors.

Relevant Code:

The issue stems from litellm/llms/gemini/chat/transformation.py:

  • _get_thought_signature_from_tool() extracts signatures from tool_call IDs
  • These signatures are added as separate parts even in historical messages

Temporary workaround:

Disable thinking mode for Gemini requests so no thought signature is present:
if (model.startsWith("gemini")) {
  request = {
    ...request,
    thinking: { type: "disabled", budget_tokens: 0 },
  };
}
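
The equivalent when calling the Python SDK directly (assuming the same thinking parameter is passed through by litellm.completion, as it is by the proxy above):

resp = litellm.completion(
    model="gemini/gemini-2.5-flash",
    messages=messages,
    tools=tools,
    # With thinking disabled, Gemini returns no thought signatures, so no extra
    # thoughtSignature parts are added when the history is replayed.
    thinking={"type": "disabled", "budget_tokens": 0},
)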

Relevant log output

Are you a ML Ops Team?

No

What LiteLLM version are you on ?

v1.80.5-stable

Twitter / LinkedIn details

No response
