
Releases: deepset-ai/haystack

v2.21.0-rc1

03 Dec 20:33


Pre-release

Release Notes

v2.21.0-rc1

Upgrade Notes

  • Updated the default Azure OpenAI model from gpt-4o-mini to gpt-4.1-mini and the default API version from 2023-05-15 to 2024-12-01-preview for both AzureOpenAIGenerator and AzureOpenAIChatGenerator.
  • The default OpenAI model has been changed from gpt-4o-mini to gpt-5-mini for OpenAIChatGenerator and OpenAIGenerator. If you rely on the default model and need to continue using gpt-4o-mini, explicitly specify it when initializing these components: OpenAIChatGenerator(model="gpt-4o-mini").

New Features

  • Three new components have been added: QueryExpander, MultiQueryEmbeddingRetriever, and MultiQueryTextRetriever. Used together, they expand a query and use each expansion to retrieve a potentially different set of documents; a usage sketch follows.
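    A minimal sketch of how these components might be combined (the import paths, parameter names, and connection sockets below are assumptions based on this note, not confirmed API):

    from haystack import Pipeline
    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.components.preprocessors import QueryExpander  # assumed import path
    from haystack.components.retrievers import MultiQueryTextRetriever  # assumed import path
    from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
    from haystack.document_stores.in_memory import InMemoryDocumentStore

    document_store = InMemoryDocumentStore()

    pipeline = Pipeline()
    # The expander uses an LLM to produce several rephrasings of the input query
    pipeline.add_component("expander", QueryExpander(chat_generator=OpenAIChatGenerator()))
    # The retriever runs the wrapped retriever once per expanded query and merges the results
    pipeline.add_component(
        "retriever", MultiQueryTextRetriever(retriever=InMemoryBM25Retriever(document_store=document_store))
    )
    pipeline.connect("expander.queries", "retriever.queries")

    result = pipeline.run(data={"expander": {"query": "effects of climate change"}})
    print(result["retriever"]["documents"])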

Enhancement Notes

  • Added a return_empty_on_no_match parameter (default True) to RegexTextExtractor.__init__(). When set to False, the component returns {"captured_text": ""} instead of {} when no regex match is found. This provides a consistent output structure for pipeline integration.
  • The FilterRetriever and AutoMergingRetriever components now support asynchronous execution.
  • Previously, when using tracing with objects like ByteStream and ImageContent, the payload sent to the tracing backend could become too large, hitting provider limits or causing performance degradation. We now replace these objects with string placeholders to avoid oversized payloads.
  • The default OpenAI model for OpenAIChatGenerator and OpenAIGenerator has been updated from gpt-4o-mini to gpt-5-mini.

Bug Fixes

  • Ensure request header keys are unique in link_content to prevent 400 Bad Request errors.

    Some image providers return a 400 Bad Request when using ImageContent.from_url() because the User-Agent header appears multiple times with different casing (e.g., user-agent, User-Agent). This update normalizes header keys in a case-insensitive way, removes duplicates, and preserves only the last occurrence.

  • Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.

  • Fixed the serialization and deserialization of pipeline_outputs in pipeline_snapshot, making it use the same schema as the rest of the pipeline state when running pipelines with breakpoints. Deserialization of the older pipeline_outputs format without a serialization schema is supported until Haystack 2.23.0.

  • Fixed ToolInvoker missing tools after warmup for lazy-initialized toolsets. The invoker now refreshes its tool registry post-warmup, ensuring replaced placeholders (e.g., MCPToolset with eager_connect=False) resolve to the actual tool names at invocation time.

💙 Big thank you to everyone who contributed to this release!

@Amnah199, @anakin87, @davidsbatista, @dfokina, @mrchtr, @OscarPindaro, @schwartzadev, @sjrl, @TaMaN2031A, @vblagoje, @YassineGabsi, @ZeJ0hn

v2.20.0

13 Nov 15:06

Choose a tag to compare

⭐️ Highlights

Support for OpenAI's Responses API

Haystack now integrates OpenAI's Responses API through the new OpenAIResponsesChatGenerator and AzureOpenAIResponsesChatGenerator components.

This unlocks several advanced capabilities like:

  • Retrieving concise summaries of the model’s reasoning process.
  • Using native OpenAI or MCP tool formats alongside Haystack Tool objects and Toolset instances.

Example with reasoning and a web search tool:

from haystack.components.generators.chat import OpenAIResponsesChatGenerator, AzureOpenAIResponsesChatGenerator
from haystack.dataclasses import ChatMessage

# with `OpenAIResponsesChatGenerator`
chat_generator = OpenAIResponsesChatGenerator(
    model="o3-mini",
    generation_kwargs={"summary": "auto", "effort": "low"},
    tools=[{"type": "web_search"}],
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's a positive news story from today?")])

# with `AzureOpenAIResponsesChatGenerator`
chat_generator = AzureOpenAIResponsesChatGenerator(
    azure_endpoint="https://example-resource.azure.openai.com/",
    azure_deployment="gpt-5-mini",
    generation_kwargs={"reasoning": {"effort": "low", "summary": "auto"}},
)
response = chat_generator.run(messages=[ChatMessage.from_user("What's Natural Language Processing?")])

print(response["replies"][0].text)

🚀 New Features

  • Added the AzureOpenAIResponsesChatGenerator, a new component that integrates Azure OpenAI's Responses API into Haystack.
  • Added the OpenAIResponsesChatGenerator, a new component that integrates OpenAI's Responses API into Haystack.
  • If logprobs are enabled in the generation kwargs, return logprobs in ChatMessage.meta for OpenAIChatGenerator and OpenAIResponsesChatGenerator.
  • Added an extra field to ToolCall and ToolCallDelta to store provider-specific information.
  • Updated serialization and deserialization of PipelineSnapshots to work with pydantic BaseModels.
  • Added async support to SentenceWindowRetriever with a new run_async() method, allowing the retriever to be used in async pipelines and workflows.
  • Added warm_up() method to all ChatGenerator components (OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator, and FallbackChatGenerator) to properly initialize tools that require warm-up before pipeline execution. The warm_up() method is idempotent and follows the same pattern used in Agent and ToolInvoker components. This enables proper tool initialization in pipelines that use ChatGenerators with tools but without an Agent component.
  • The AnswerBuilder component now exposes a new parameter return_only_referenced_documents (default: True) that controls whether only documents referenced in the replies are returned. Returned documents include two new fields in the meta dictionary:
    • source_index: the 1-based index of the document in the input list
    • referenced: a boolean value indicating if the document was referenced in the replies (only present if the reference_pattern parameter is provided).
    These additions make it easier to display references and other sources within a RAG pipeline.
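    A minimal sketch of the new parameter (the run arguments shown are AnswerBuilder's existing sockets; the reference pattern matches citations like "[1]"):

    from haystack import Document
    from haystack.components.builders import AnswerBuilder

    builder = AnswerBuilder(reference_pattern=r"\[(\d+)\]", return_only_referenced_documents=True)
    docs = [
        Document(content="Paris is the capital of France."),
        Document(content="Berlin is the capital of Germany."),
    ]
    result = builder.run(
        query="What is the capital of France?",
        replies=["The capital of France is Paris [1]."],
        documents=docs,
    )
    answer = result["answers"][0]
    for doc in answer.documents:  # only the referenced document is returned
        print(doc.meta["source_index"], doc.meta["referenced"])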

⚡️ Enhancement Notes

  • Added generation_kwargs to the Agent component, allowing for more fine-grained control over chat generation at runtime.
  • Added a revision parameter to all Sentence Transformers embedder components (SentenceTransformersDocumentEmbedder, SentenceTransformersTextEmbedder, SentenceTransformersSparseDocumentEmbedder, and SentenceTransformersSparseTextEmbedder) to allow users to specify a specific model revision/version from the Hugging Face Hub. This enables pinning to a particular model version for reproducibility and stability (see the sketch after this list).
  • Updated the components Agent, LLMMetadataExtractor, LLMMessagesRouter, and LLMDocumentContentExtractor to automatically call self.warm_up() at runtime if they have not been warmed up yet. This ensures that the components are ready for use without requiring an explicit warm-up call. This differs from previous behavior where warm-up had to be manually invoked before use, otherwise a RuntimeError was raised.
  • Improved log-trace correlation for DatadogTracer by using the official ddtrace.tracer.get_log_correlation_context() method.
  • Improved Toolset warm-up architecture for better encapsulation. The base Toolset.warm_up() method now warms up all tools by default, while subclasses can override it to customize initialization (e.g., setting up shared resources instead of warming individual tools). The warm_up_tools() utility function has been simplified to delegate to Toolset.warm_up().
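A minimal sketch of pinning an embedder to a model revision, as referenced above (the model name and revision value are illustrative):

from haystack.components.embedders import SentenceTransformersTextEmbedder

embedder = SentenceTransformersTextEmbedder(
    model="sentence-transformers/all-MiniLM-L6-v2",
    revision="main",  # a branch name, tag, or commit hash on the Hugging Face Hub
)
embedder.warm_up()
embedding = embedder.run(text="I love pizza!")["embedding"]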

🐛 Bug Fixes

  • Fixed deserialization of state schema when it is None in Agent.from_dict.

  • Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.

  • Fixed type compatibility issue where passing list[Tool] to components with a tools parameter (such as ToolInvoker) caused static type checker errors.
    In version 2.19, the ToolsType was changed to Union[list[Union[Tool, Toolset]], Toolset] to support mixing Tools and Toolsets. However, due to Python's list invariance, list[Tool] was no longer considered compatible with list[Union[Tool, Toolset]], breaking type checking for the common pattern of passing a list of Tool objects.

    The fix explicitly lists all valid type combinations in ToolsType: Union[list[Tool], list[Toolset], list[Union[Tool, Toolset]], Toolset]. This preserves backward compatibility for existing code while still supporting the new functionality of mixing Tools and Toolsets.

    Users who encountered type errors like "Argument of type 'list[Tool]' cannot be assigned to parameter 'tools'" should no longer see these errors after upgrading. No code changes are required on the user side.

  • When creating a pipeline snapshot, we now ensure use of _deepcopy_with_exceptions when copying component inputs to avoid deep copies of items like components and tools since they often contain attributes that are not deep-copyable.
    For example, the LinkContentFetcher has httpx.Client as an attribute, which throws an error if deep-copied.

💙 Big thank you to everyone who contributed to this release!

@Amnah199, @anakin87, @cmnemoi, @davidsbatista, @dfokina, @HamidOna, @Hansehart, @jdb78, @mrchtr, @sjrl, @swapniel99, @TaMaN2031A, @tstadel, @vblagoje

v2.20.0-rc2

13 Nov 10:55


Pre-release

Release Notes

v2.20.0-rc2

Bug Fixes

  • Fixed a bug where components explicitly listed in include_outputs_from would not appear in the pipeline results if they returned an empty dictionary. Now, any component specified in include_outputs_from will be included in the results regardless of whether its output is empty.


v2.20.0-rc1

11 Nov 14:59


Pre-release


v2.19.0

20 Oct 12:53


⭐️ Highlights

🛡️ Try Multiple LLMs with FallbackChatGenerator

Introduced FallbackChatGenerator, a resilient chat generator that runs multiple LLMs sequentially and automatically falls back when one fails. It tries each generator in order until one succeeds, handling errors like timeouts, rate limits, or server issues. Ideal for building robust, production-grade chat systems that stay responsive across providers.

from haystack.dataclasses import ChatMessage
from haystack_integrations.components.generators.google_genai import GoogleGenAIChatGenerator
from haystack_integrations.components.generators.anthropic import AnthropicChatGenerator
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.components.generators.chat.fallback import FallbackChatGenerator

anthropic_generator = AnthropicChatGenerator(model="claude-sonnet-4-5", timeout=1) # force failure with low timeout
google_generator = GoogleGenAIChatGenerator(model="gemini-2.5-flashy") # force failure with typo in model name
openai_generator = OpenAIChatGenerator(model="gpt-4o-mini") # success

chat_generator = FallbackChatGenerator(chat_generators=[anthropic_generator, google_generator, openai_generator])
response = chat_generator.run(messages=[ChatMessage.from_user("What is the plot twist in Shawshank Redemption?")])

print("Successful ChatGenerator: ", response["meta"]["successful_chat_generator_class"])
print("Response: ", response["replies"][0].text)

Output:

WARNING:haystack.components.generators.chat.fallback:ChatGenerator AnthropicChatGenerator failed with error: Request timed out or interrupted...
WARNING:haystack.components.generators.chat.fallback:ChatGenerator GoogleGenAIChatGenerator failed with error: Error in Google Gen AI chat generation: 404 NOT_FOUND...
Successful ChatGenerator:   OpenAIChatGenerator
Response:  In "The Shawshank Redemption," ....

🛠️ Mix Tool and Toolset in Agents

You can now combine both Tool and Toolset objects in the same tools list for Agent and ToolInvoker components. This update brings more flexibility, letting you organize tools into logical groups while still adding standalone tools in one go.

from haystack.components.agents import Agent
from haystack.tools import Tool, Toolset

math_toolset = Toolset([add_tool, multiply_tool])
weather_toolset = Toolset([weather_tool, forecast_tool])

agent = Agent(
    chat_generator=generator,
    tools=[math_toolset, weather_toolset, calendar_tool],  # ✨ Now supported!
)

⚙️ Faster Agents with Tool Warmup

Tool and Toolset objects can now perform initialization during Agent or ToolInvoker warmup. This allows setup tasks such as connecting to databases, loading models, or initializing connection pools before the first use.

from haystack.tools import Toolset
from haystack.components.agents import Agent

# Custom toolset with initialization needs
# (query_tool, update_tool, and create_connection_pool are placeholders for your own tools and helpers)
class DatabaseToolset(Toolset):
    def __init__(self, connection_string):
        self.connection_string = connection_string
        self.pool = None
        super().__init__([query_tool, update_tool])
        
    def warm_up(self):
        # Initialize connection pool
        self.pool = create_connection_pool(self.connection_string)

🚀 New Features

  • Updated our serialization and deserialization of PipelineSnapshots to work with Python Enum classes.

  • Added FallbackChatGenerator, which automatically retries different chat generators and returns the first successful response, with detailed information about which providers were tried.

  • Added pipeline_snapshot and pipeline_snapshot_file_path parameters to BreakpointException to provide more context when a pipeline breakpoint is triggered.
    Added pipeline_snapshot_file_path parameter to PipelineRuntimeError to include a reference to the stored pipeline snapshot so it can be easily found.

  • Added RegexTextExtractor, a new component that extracts text from chat messages or string inputs based on a custom regex pattern.

  • CSVToDocument: added conversion_mode='row' with an optional content_column; each row becomes a Document, the remaining columns are stored in meta, and the default 'file' mode is preserved (see the sketch after this list).

  • Added the ability to resume an Agent from an AgentSnapshot while specifying a new breakpoint in the same run call. This allows stepwise debugging and precise control over chat generator inputs and tool inputs before execution, improving flexibility when inspecting intermediate states. This addresses a previous limitation where passing both a snapshot and a breakpoint simultaneously would throw an exception.

  • Introduce SentenceTransformersSparseTextEmbedder and SentenceTransformersSparseDocumentEmbedder components. These components embed text and documents using sparse embedding models compatible with Sentence Transformers. Sparse embeddings are interpretable, efficient when used with inverted indexes, combine classic information retrieval with neural models, and are complementary to dense embeddings. Currently, the produced SparseEmbedding objects are compatible with the QdrantDocumentStore.

    Usage example:

    from haystack.components.embedders import SentenceTransformersSparseTextEmbedder
    
    text_embedder = SentenceTransformersSparseTextEmbedder()
    text_embedder.warm_up()
    
    print(text_embedder.run("I love pizza!"))
    # {'sparse_embedding': SparseEmbedding(indices=[999, 1045, ...], values=[0.918, 0.867, ...])}
  • Added a warm_up() function to the Tool dataclass, allowing tools to perform resource-intensive initialization before execution. Tools and Toolsets can now override the warm_up() method to establish connections to remote services, load models, or perform other preparatory operations. The ToolInvoker and Agent automatically call warm_up() on their tools during their own warm-up phase, ensuring tools are ready before use.

  • Fixed a serialization issue related to function objects in a pipeline; they are now converted to type None (functions cannot be serialized). This was preventing breakpoints from being successfully set in agents and used as resume points. If an error occurs during an Agent execution, for instance during tool calling, a snapshot of the last successful step is raised, allowing the caller to catch it, inspect the possible reason for the crash, and resume the pipeline execution from that point onwards.
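A minimal sketch of CSVToDocument's new row mode, as referenced above (the file and column names are illustrative; a CSV with a "text" column plus extra columns is assumed):

from haystack.components.converters import CSVToDocument

converter = CSVToDocument(conversion_mode="row", content_column="text")
documents = converter.run(sources=["reviews.csv"])["documents"]
# each row becomes one Document; the remaining columns are stored in each Document's meta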

⚡️ Enhancement Notes

  • Added tools to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset (see the sketch after this list).
  • Enhanced the tools parameter across all tool-accepting components (Agent, ToolInvoker, OpenAIChatGenerator, AzureOpenAIChatGenerator, HuggingFaceAPIChatGenerator, HuggingFaceLocalChatGenerator) to accept either a mixed list of Tool and Toolset objects or just a Toolset object. Previously, components required either a list of Tool objects OR a single Toolset, but not both in the same list. Now users can organize tools into logical Toolsets while also including standalone Tool objects, providing greater flexibility in tool organization. For example: Agent(chat_generator=generator, tools=[math_toolset, weather_toolset, standalone_tool]). This change is fully backward compatible and preserves structure during serialization/deserialization, enabling proper round-trip support for mixed tool configurations.
  • Refactored _save_pipeline_snapshot to consolidate try-except logic and added a raise_on_failure option to control whether save failures raise an exception or are logged. _create_pipeline_snapshot now wraps _serialize_value_with_schema in try-except blocks to prevent failures from non-serializable pipeline inputs.
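A minimal sketch of selecting a subset of tools at runtime (assuming agent is an already-configured Agent; the tool name is illustrative):

from haystack.dataclasses import ChatMessage

result = agent.run(
    messages=[ChatMessage.from_user("What is 7 times 6?")],
    tools=["multiply_tool"],  # names must refer to tools already configured on the agent
)
print(result["last_message"].text)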

🐛 Bug Fixes

  • Fixed the Agent run_async method to correctly handle async streaming callbacks, which previously raised errors.
  • Prevent duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
  • We were setting response_format to None in OpenAIChatGenerator by default, which doesn't follow the API spec. We now omit the parameter if the user does not pass response_format.
  • Ensure that the OpenAIChatGenerator is properly serialized when response_format in generation_kwargs is provided as a dictionary (for example, {"type": "json_object"}). Previously, this caused serialization errors.
  • Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.
  • Refactored SentenceTransformersEmbeddingBackend to ensure unique embedding IDs by incorporating all relevant arguments.
  • Fixed Agent to correctly raise a BreakpointException when a ToolBreakpoint with a specific tool_name is provided in an assistant chat message containing multiple tool calls.
  • The OpenAIChatGenerator implementation uses ChatCompletionMessageCustomToolCall, which is only available in OpenAI client >=1.99.2. We now require openai>=1.99.2.

💙 Big thank you to everyone who contributed to this release!

@anakin87, @bilgeyucel, @davidsbatista, @dfokina, @...


v2.19.0-rc1

20 Oct 10:37


Pre-release

v2.18.1

29 Sep 09:43


Release Notes

v2.18.1

⚡️ Enhancement Notes

  • Added tools to agent run parameters to enhance the agent's flexibility. Users can now choose a subset of tools for the agent at runtime by providing a list of tool names, or supply an entirely new set by passing Tool objects or a Toolset.

🐛 Bug Fixes

  • Fixed the Agent run_async method to correctly handle async streaming callbacks, which previously raised errors.
  • Prevent duplication of the last assistant message in the chat history when initializing from an AgentSnapshot.
  • We were setting response_format to None in OpenAIChatGenerator by default, which doesn't follow the API spec. We now omit the parameter if the user does not pass response_format.

v2.18.0

22 Sep 14:45


⭐️ Highlights

🔁 Pipeline Error Recovery with Snapshots

Pipelines now capture a snapshot of the last successful step when a run fails, including intermediate outputs. This lets you diagnose issues (e.g., failed tool calls), fix them, and resume from the checkpoint instead of restarting the entire run. Currently, this is supported for the synchronous Pipeline and Agent (not yet for AsyncPipeline).

The snapshot is part of the exception raised with the PipelineRuntimeError when the pipeline run fails. You need to wrap your pipeline.run() in a try-except block.

from haystack.core.errors import PipelineRuntimeError

try:
    pipeline.run(data=input_data)
except PipelineRuntimeError as exc:
    snapshot = exc.pipeline_snapshot
    intermediate_outputs = snapshot.pipeline_state.pipeline_outputs

# The snapshot can be used to resume execution of a Pipeline by passing it to the run() method via the snapshot argument
pipeline.run(data={}, snapshot=snapshot)

🧠 Structured Outputs for OpenAI/Azure OpenAI

OpenAIChatGenerator and AzureOpenAIChatGenerator support structured outputs via response_format (Pydantic model or JSON schema).

from pydantic import BaseModel
from haystack.components.generators.chat.openai import OpenAIChatGenerator
from haystack.dataclasses import ChatMessage

class CalendarEvent(BaseModel):
    event_name: str
    event_date: str
    event_location: str

generator = OpenAIChatGenerator(generation_kwargs={"response_format": CalendarEvent})

message = "The Open NLP Meetup is going to be in Berlin at deepset HQ on September 19, 2025"
result = generator.run([ChatMessage.from_user(message)])
print(result["replies"][0].text)

# {"event_name":"Open NLP Meetup","event_date":"September 19","event_location":"deepset HQ, Berlin"}

🛠️ Convert Pipelines into Tools with PipelineTool

The new PipelineTool lets you expose entire Haystack Pipelines as LLM-compatible tools. It simplifies the previous SuperComponent + ComponentTool pattern into a single abstraction and directly exposes input_mapping and output_mapping for fine-grained control.

from haystack import Pipeline
from haystack.components.rankers import TransformersSimilarityRanker
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.tools import PipelineTool

# An illustrative retrieval pipeline; any components matching the socket names in the mappings below work the same way
document_store = InMemoryDocumentStore()
retrieval_pipeline = Pipeline()
retrieval_pipeline.add_component("bm25_retriever", InMemoryBM25Retriever(document_store=document_store))
retrieval_pipeline.add_component("ranker", TransformersSimilarityRanker())
retrieval_pipeline.connect("bm25_retriever.documents", "ranker.documents")

retrieval_tool = PipelineTool(
    pipeline=retrieval_pipeline,
    input_mapping={"query": ["bm25_retriever.query"]},
    output_mapping={"ranker.documents": "documents"},
    name="retrieval_tool",
    description="Use to retrieve documents",
)

🗺️ Runtime System Prompt for Agents

Agent’s system_prompt can now be updated dynamically at runtime for more flexible behavior.
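A minimal sketch (assuming agent is an already-configured Agent; the prompt text is illustrative):

from haystack.dataclasses import ChatMessage

result = agent.run(
    messages=[ChatMessage.from_user("Summarize the latest release.")],
    system_prompt="You are a terse release-notes assistant. Answer in two sentences.",
)
print(result["last_message"].text)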

🚀 New Features

  • OpenAIChatGenerator and AzureOpenAIChatGenerator now support structured outputs using response_format parameter that can be passed in generation_kwargs. The response_format parameter can be a Pydantic model or a JSON schema for non-streaming responses. For streaming responses, the response_format must be a JSON schema. Example usage of the response_format parameter:

    from pydantic import BaseModel
    from haystack.components.generators.chat import OpenAIChatGenerator
    from haystack.dataclasses import ChatMessage
    
    class NobelPrizeInfo(BaseModel):
        recipient_name: str
        award_year: int
        category: str
        achievement_description: str
        nationality: str
    
    client = OpenAIChatGenerator(
        model="gpt-4o-2024-08-06",
        generation_kwargs={"response_format": NobelPrizeInfo}
    )
    
    response = client.run(messages=[
        ChatMessage.from_user("In 2021, American scientist David Julius received the Nobel Prize in"
        " Physiology or Medicine for his groundbreaking discoveries on how the human body"
        " senses temperature and touch.")
    ])
    print(response["replies"][0].text)
    >>> {"recipient_name":"David Julius","award_year":2021,"category":"Physiology or Medicine","achievement_description":"David Julius was awarded for his transformative findings regarding the molecular mechanisms underlying the human body's sense of temperature and touch. Through innovative experiments, he identified specific receptors responsible for detecting heat and mechanical stimuli, ranging from gentle touch to pain-inducing pressure.","nationality":"American"}
  • Added PipelineTool, a new tool wrapper that allows Haystack Pipelines to be exposed as LLM-compatible tools.

    • Previously, this was achievable by first wrapping a pipeline in a SuperComponent and then passing it to ComponentTool.
    • PipelineTool streamlines that pattern into a dedicated abstraction. It uses the same approach under the hood but directly exposes input_mapping and output_mapping so users can easily control which pipeline inputs and outputs are made available.
    • Automatically generates input schemas for LLM tool calling from pipeline inputs.
    • Extracts descriptions from underlying component docstrings for better tool documentation.
    • Can be passed directly to an Agent, enabling seamless integration of full pipelines as tools in multi-step reasoning workflows.
  • Add a reasoning field to StreamingChunk that optionally takes in a ReasoningContent dataclass. This is to allow a structured way to pass reasoning contents to streaming chunks.

  • If an error occurs during the execution of a pipeline, the pipeline will raise a PipelineRuntimeError exception containing an error message and the component outputs up to the point of failure, so you can inspect and debug the pipeline state at that point.

  • LinkContentFetcher: add request_headers to allow custom per-request HTTP headers. Header precedence: httpx client defaults → component defaults → request_headers → rotating User-Agent. Also make HTTP/2 handling import-safe: if h2 isn’t installed, fall back to HTTP/1.1 with a warning. Thanks @xoaryaa. (Fixes #9064)

  • A snapshot of the last successful step is also raised when an error occurs during a Pipeline run, allowing the caller to catch it, inspect the possible reason for the crash, and use it to resume the pipeline execution from that point onwards.

  • Add exclude_subdomains parameter to SerperDevWebSearch component. When set to True, this parameter restricts search results to only the exact domains specified in allowed_domains, excluding any subdomains. For example, with allowed_domains=["example.com"] and exclude_subdomains=True, results from "blog.example.com" or "shop.example.com" will be filtered out, returning only results from "example.com". The parameter defaults to False to maintain backward compatibility with existing behavior.
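    A minimal sketch of the new parameter (assumes the SERPERDEV_API_KEY environment variable is set; the domain is illustrative):

    from haystack.components.websearch import SerperDevWebSearch

    search = SerperDevWebSearch(allowed_domains=["example.com"], exclude_subdomains=True)
    results = search.run(query="haystack release notes")
    # results from subdomains such as blog.example.com are filtered out
    print(results["documents"])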

⚡️ Enhancement Notes

  • Added system_prompt to agent run parameters to enhance customization and control over agent behavior.
  • The internal Agent logic was refactored to improve readability and maintainability. This should help developers understand and extend the internal Agent logic moving forward.

🐛 Bug Fixes

  • Reintroduce verbose error message when deserializing a ChatMessage with invalid content parts. While LLMs may still generate messages in the wrong format, this error provides guidance on the expected structure, making retries easier and more reliable during agent runs. The error message was unintentionally removed during a previous refactoring.
  • The English and German abbreviation files used by the SentenceSplitter are now included in the distribution. They were previously missing due to a config in the .gitignore file.
  • Preserve explicit lambda_threshold=0.0 in SentenceTransformersDiversityRanker instead of overriding it with 0.5 due to short-circuit evaluation.
  • Fixed MetaFieldGroupingRanker to still work when subgroup_by values are unhashable types like list. We handle this by stringifying the contents of doc.meta[subgroup_by] in the same way we do for values of doc.meta[group_by].
  • Fixed missing trace parentage for tools executed via the synchronous ToolInvoker path. Updated ToolInvoker.run() to propagate contextvars into ThreadPoolExecutor workers, ensuring all tool spans (ComponentTool, Agent wrapped in ComponentTool, or custom tools) are correctly linked to the outer Agent's trace instead of starting new root traces. This improves end-to-end observability across the entire tool execution chain.
  • Fixed the from_dict method of MetadataRouter so the output_type parameter introduced in Haystack 2.17 is now optional when loading from YAML. This ensures compatibility with older Haystack pipelines.
  • In OpenAIChatGenerator, improved the logic to exclude unsupported custom tool calls. The previous implementation caused compatibility issues with the Mistral Haystack core integration, which extends OpenAIChatGenerator.
  • Fixed parameter schema generation in ComponentTool when using inputs_from_state. Previously, parameters were only removed from the schema if the state key and parameter name matched exactly. For example, inputs_from_state={"text": "text"} removed text as expected, but inputs_from_state={"state_text": "text"} did not. This is now resolved, and such cases work as intended.

💙 Big thank you to everyone who contributed to this release!

@Amnah199, @Ujjwal-Bajpayee, @abdokaseb, @anakin87, @davidsbatista, @dfokina, @rigved-telang, @sjrl, @tstadel, @vblagoje, @xoaryaa

v2.18.0-rc1

17 Sep 14:01


Pre-release

v2.17.1

20 Aug 09:18


Release Notes

v2.17.1

Bug Fixes

  • Fixed the from_dict method of MetadataRouter so the output_type parameter introduced in Haystack 2.17 is now optional when loading from YAML. This ensures compatibility with older Haystack pipelines.
  • In OpenAIChatGenerator, improved the logic to exclude unsupported custom tool calls. The previous implementation caused compatibility issues with the Mistral Haystack core integration, which extends OpenAIChatGenerator.