
Conversation

@repeat-Q (Contributor)

PR Title

fix(openai): normalize tool_choice inputs across providers (fix #34129)

PR Description

Summary

  • Add a small core helper, normalize_tool_choice, to canonicalize user-provided tool_choice values (e.g. "any" -> "required", True -> "required"), and demonstrate usage in the OpenAI adapter by calling it from ChatOpenAI.bind_tools; a minimal sketch of the helper follows this list.
  • Fixes #34129 ("Forcing tool calls" is not universal).
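
A minimal sketch of the helper's intended behavior, based on the mappings described in this PR (the actual implementation in tool_choice.py may differ):

    from typing import Optional, Union

    def normalize_tool_choice(
        tool_choice: Optional[Union[dict, str, bool]],
    ) -> Optional[Union[dict, str, bool]]:
        """Canonicalize user-provided tool_choice values (sketch)."""
        # "any" and True both mean "the model must call some tool".
        if tool_choice is True or tool_choice == "any":
            return "required"
        # "auto", "none", False, None, dicts, and specific tool names pass
        # through unchanged; adapters apply provider-specific mappings later.
        return tool_choice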

Files changed

  • Added: tool_choice.py
  • Modified: base.py (use normalized tool_choice in bind_tools)
  • Added test: test_tool_choice.py

Motivation

  • Different provider adapters interpret or support tool_choice differently (for example, some require "required", others ignore "any"). This leads to inconsistent behavior across models.
  • The change introduces a conservative, centralized normalization step so adapters can rely on a consistent canonical input; adapters may still apply provider-specific mappings on top of this canonical value.

Details

  • normalize_tool_choice performs conservative mappings only (e.g. "any" -> "required", True -> "required", leaves "auto", "none", dicts, and specific tool names as-is).
  • The OpenAI adapter now calls this helper before performing its existing provider-specific mapping logic. The change is intentionally minimal and non-breaking; an integration sketch follows this list.
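
Illustratively, the adapter-side change is a single canonicalization step at the top of the tool_choice handling, ahead of the existing OpenAI-specific mapping. The snippet below is a sketch rather than the actual diff: the class is a stand-in, and it reuses the normalize_tool_choice sketch from the Summary section.

    class ChatOpenAISketch:
        """Stand-in showing where normalization slots into bind_tools."""

        def bind_tools(self, tools, *, tool_choice=None, **kwargs):
            # New: canonicalize before any provider-specific handling, so
            # "any" and True arrive here already mapped to "required".
            tool_choice = normalize_tool_choice(tool_choice)
            # ...the existing OpenAI-specific mapping continues from here,
            # e.g. wrapping a bare tool name in {"type": "function", ...}.
            return {"tools": tools, "tool_choice": tool_choice, **kwargs}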

Testing & How to run locally

  • Unit tests added to cover common inputs: "any", True, False, None, and "auto"; a sketch of these tests follows this list.
  • Recommended local verification:
    pip install -e .
    pytest libs/core/tests/unit_tests/test_tool_choice.py -q
    make format
    make lint
    make test
    Note: CI requires make format, make lint, and make test to pass.
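
A sketch of the added unit tests, parametrized over the inputs listed above; the pass-through expectations for False, None, and "auto" are inferred from the "conservative mappings only" description, and the import path for the helper is an assumption (the PR only states that core gained a tool_choice.py):

    import pytest

    # Assumed import path; not confirmed by the PR description.
    from langchain_core.tool_choice import normalize_tool_choice

    @pytest.mark.parametrize(
        ("given", "expected"),
        [
            ("any", "required"),  # canonicalized
            (True, "required"),   # canonicalized
            (False, False),       # assumed pass-through for adapters to map
            (None, None),         # assumed pass-through
            ("auto", "auto"),     # left as-is per the Details section
        ],
    )
    def test_normalize_tool_choice(given, expected):
        assert normalize_tool_choice(given) == expected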

Compatibility

  • Backwards-compatible: existing provider mappings are preserved (OpenAI will still receive "required" where appropriate).
  • Providers that do not support tool_choice (e.g., those documented to ignore it) will continue to behave as before. We recommend gradually adopting the normalization helper across adapters for a consistent user experience; a usage example follows this list.
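
For example, after this change the two bindings below should force a tool call identically for OpenAI (the model name and tool are illustrative, not part of the PR):

    from langchain_core.tools import tool
    from langchain_openai import ChatOpenAI

    @tool
    def get_weather(city: str) -> str:
        """Return a canned weather report for a city."""
        return f"It is sunny in {city}."

    llm = ChatOpenAI(model="gpt-4o-mini")
    # "any" is canonicalized to "required" before the OpenAI-specific mapping,
    # so both bindings force the model to call a tool:
    forced_a = llm.bind_tools([get_weather], tool_choice="any")
    forced_b = llm.bind_tools([get_weather], tool_choice="required")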

Next steps (suggested)

  • Apply the same normalization pattern to other major adapters (e.g. HuggingFace, MistralAI, Groq) in follow-up PRs to fully standardize behavior across providers. Prefer separate PRs per package to match repo contribution guidelines.

AI assistant disclosure

  • I used generative AI assistance to help draft the implementation and PR text; the final code and tests were authored and verified by the contributor.

Create clean branch off remote master with only the three intended changes. Fixes langchain-ai#34129
github-actions bot added labels on Nov 28, 2025: integration (Related to a provider partner package integration), core (Related to the package `langchain-core`), openai, fix

codspeed-hq bot commented Nov 28, 2025

CodSpeed Performance Report

Merging #34135 will degrade performance by 24.65%

Comparing repeat-Q:fix/34129-normalize-tool-choice (7ad1eef) with master (12df938)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

❌ 1 regression
✅ 12 untouched
⏩ 21 skipped¹

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

Mode       Benchmark                      BASE      HEAD      Change
WallTime   test_async_callbacks_in_sync   18.6 ms   24.7 ms   -24.65%

Footnotes

  1. 21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them on CodSpeed to remove them from the performance reports.
