@bedrin (Contributor) commented Dec 4, 2025


## Overview

This PR adds support for OpenAI's new **Responses API** to the `OpenAiApi` class, providing low-level access to OpenAI's latest agentic capabilities. The Responses API represents OpenAI's unified interface for building agent-like applications with built-in tools, multi-turn conversations, and enhanced reasoning capabilities.

**Important**: This PR adds support at the **low-level API layer only** (the `OpenAiApi` class). It does not integrate with the high-level `ChatModel` abstractions. The Responses API is a stateful, standalone API (OpenAI's latest agentic offering) rather than a traditional chat model: it doesn't fit the existing `ChatModel` abstractions and can't easily be integrated as just another chat-model provider. It represents an entirely new agentic category.

## Related Issues

- Closes spring-projects#4221 - Support for OpenAI Responses API
- Related to spring-projects#2962 - Enhanced reasoning model support
- Related to spring-projects#3022 - Multi-turn conversation handling

## Changes

### 1. Core API Support (`OpenAiApi.java`)

#### Added DTOs

**Request DTO - `ResponseRequest`**:
- Comprehensive request object with 24 parameters
- Parameters include: `model`, `input`, `instructions`, `temperature`, `tools`, `reasoning`, `conversation`, `previousResponseId`, etc.
- Supports all Responses API features: reasoning models, built-in tools, structured outputs, multi-turn conversations
- Includes nested records: `TextConfig`, `TextFormat`, `ReasoningConfig`

**Response DTO - `Response`**:
- Complete response structure with `id`, `status`, `model`, `output`, `usage`, etc.
- Nested records: `OutputItem`, `ContentItem`, `ReasoningDetails`, `ResponseError`, `IncompleteDetails`
- Supports multiple output types: messages, reasoning, tool calls

**Streaming DTO - `ResponseStreamEvent`**:
- Event-based streaming support
- Includes: `type`, `sequenceNumber`, `response`, `delta`, `text`, etc.
- Enables real-time processing of responses

#### Added Methods

- `responseEntity(ResponseRequest)` - Synchronous response creation
- `responseEntity(ResponseRequest, HttpHeaders)` - Synchronous with custom headers
- `responseStream(ResponseRequest)` - Streaming response creation
- `responseStream(ResponseRequest, HttpHeaders)` - Streaming with custom headers

#### Added Configuration

- `responsesPath` field (default: `/v1/responses`)
- Builder support for responses path configuration
- Updated constructors to include responses path

### 2. Autoconfiguration Support

#### `OpenAiChatProperties.java`
- Added `responsesPath` property with default value `/v1/responses`
- Added getter/setter methods
- Follows same pattern as `completionsPath` and `embeddingsPath`

#### `OpenAiChatAutoConfiguration.java`
- Updated `openAiApi()` bean to include `.responsesPath(chatProperties.getResponsesPath())`
- Enables Spring Boot property configuration

#### `OpenAiEmbeddingAutoConfiguration.java`
- Updated `openAiApi()` method to include responses path
- Uses default constant for consistency

### 3. Integration Tests (`OpenAiApiIT.java`)

Added 4 comprehensive integration tests:

1. **`responseEntity()`** - Basic synchronous response
   - Tests simple request/response flow
   - Validates response structure and content
   - Cost: ~10-20 tokens

2. **`responseStream()`** - Streaming responses
   - Tests event stream processing
   - Validates multiple event types
   - Cost: ~10-20 tokens

3. **`responseWithInstructionsAndConfiguration()`** - Advanced configuration
   - Tests system instructions and parameters
   - Validates parameter echo and content accuracy
   - Cost: ~10-20 tokens

4. **`responseWithWebSearchTool()`** - Built-in web_search tool
   - Demonstrates built-in tool usage (no custom implementation needed)
   - Tests tool execution and response handling
   - Validates output structure with tool calls
   - Cost: ~30-50 tokens

**Total estimated cost**: ~$0.0002 - $0.0005 per test run

### 4. Unit Tests (`ResponsesApiTest.java`)

Added comprehensive unit tests covering:
- `ResponseRequest` creation with various parameter combinations
- `Response` structure validation
- `ResponseStreamEvent` structure validation
- Convenience constructors

### 5. Documentation Updates

#### `openai-chat.adoc`
- Added `spring.ai.openai.chat.responses-path` property documentation
- Updated Chat Completions API references for clarity
- Changed note about Responses API availability (now supported via `OpenAiApi`)

## Key Features

### Built-in Tools
The Responses API provides tools without custom implementation:
- **`web_search`** - Search the internet (demonstrated in integration test)
- **`file_search`** - Search through uploaded files
- **`code_interpreter`** - Execute Python code
- **`computer_use`** - Interact with computer interfaces
- Remote MCPs (Model Context Protocol)

### Multi-turn Conversations
Native support for stateful conversations:
- Via `previousResponseId` parameter
- Via `conversation` object/ID

### Reasoning Models
Enhanced support for reasoning models (gpt-5, o-series):
- Configurable reasoning effort levels
- Access to reasoning content and summaries

### Structured Outputs
JSON schema validation via `TextConfig`:
- Type-safe structured responses
- Schema validation with `strict` mode

## Configuration

### Default Configuration (Minimal)
```yaml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
```

### Custom Configuration
```yaml
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
      chat:
        responses-path: /v1/responses  # Can be customized for compatible servers
```

## Usage Examples

### Basic Synchronous Request
```java
@Autowired
private OpenAiApi openAiApi;

public void example() {
    var request = new OpenAiApi.ResponseRequest("What is AI?", "gpt-4o");
    ResponseEntity<OpenAiApi.Response> response = openAiApi.responseEntity(request);

    // Extract text from response
    String text = response.getBody()
        .output()
        .stream()
        .filter(item -> "message".equals(item.type()))
        .flatMap(item -> item.content().stream())
        .filter(content -> "output_text".equals(content.type()))
        .map(OpenAiApi.Response.ContentItem::text)
        .findFirst()
        .orElse(null);
}
```

### Streaming Request
```java
var request = new OpenAiApi.ResponseRequest("Tell me a story", "gpt-4o", true);
Flux<OpenAiApi.ResponseStreamEvent> stream = openAiApi.responseStream(request);

stream.subscribe(event -> {
    if ("response.output_text.delta".equals(event.type())) {
        System.out.print(event.delta());
    }
});
```
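
Since the text arrives as individual delta events, callers typically need to reassemble it. The sketch below is not part of the PR; it uses a minimal stand-in record (the real `ResponseStreamEvent` carries more fields) and shows the accumulation logic over a simulated, finished stream:

```java
import java.util.List;
import java.util.stream.Collectors;

class DeltaAccumulatorSketch {

    // Minimal stand-in for ResponseStreamEvent: only the two fields the
    // accumulator needs.
    record Event(String type, String delta) { }

    // Concatenates the text deltas, ignoring lifecycle events such as
    // response.created / response.completed.
    static String accumulate(List<Event> events) {
        return events.stream()
            .filter(e -> "response.output_text.delta".equals(e.type()))
            .map(Event::delta)
            .collect(Collectors.joining());
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            new Event("response.created", null),
            new Event("response.output_text.delta", "Once upon "),
            new Event("response.output_text.delta", "a time."),
            new Event("response.completed", null));
        System.out.println(accumulate(events)); // prints "Once upon a time."
    }
}
```

In reactive code the same idea maps onto the `Flux` returned by `responseStream`, e.g. filtering the delta events and reducing their payloads into a `StringBuilder`.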

### Using Built-in Web Search Tool
```java
var webSearchTool = Map.of("type", "web_search");

var request = new OpenAiApi.ResponseRequest(
    "gpt-4o",
    "What is the current weather in San Francisco?",
    null, null, null, null, null,
    List.of(webSearchTool),  // Enable web_search tool
    null, null, false, null, null, null, null, null, null,
    List.of("web_search_call.action.sources"),  // Include search sources
    null, null, null, null, null, null
);

ResponseEntity<OpenAiApi.Response> response = openAiApi.responseEntity(request);
```

### Multi-turn Conversation
```java
// First request
var request1 = new OpenAiApi.ResponseRequest("What is 2+2?", "gpt-4o");
var response1 = openAiApi.responseEntity(request1);
String responseId = response1.getBody().id();

// Follow-up request
var request2 = new OpenAiApi.ResponseRequest(
    "gpt-4o",
    "And what is that number multiplied by 3?",
    null, null, null, null, null, null, null, null,
    false, null, null, null,
    responseId,  // Reference previous response
    null, null, null, null, null, null, null, null, null
);

var response2 = openAiApi.responseEntity(request2);
```

## Design Decisions

### Why Low-Level API Only?

The Responses API is fundamentally different from traditional chat models:

1. **Stateful vs Stateless**: The Responses API is designed for stateful, multi-turn agent applications, while `ChatModel` is stateless
2. **Built-in Tools**: Responses API provides native tools (web search, file search, etc.) without custom implementation, unlike `ChatModel`'s function calling
3. **Different Abstractions**: The output structure (`output` array with multiple item types) doesn't map cleanly to `ChatResponse`
4. **Agent-First Design**: Represents a new category of agentic applications rather than a traditional chat interface
5. **Future Evolution**: OpenAI is positioning this as the future of agent development, separate from chat completions

### Implementation Patterns

1. **Follows Existing Conventions**: Mirrors `chatCompletionEntity` and `chatCompletionStream` patterns
2. **Comprehensive DTOs**: All major API fields included for maximum flexibility
3. **Convenience Constructors**: Simplified constructors for common use cases
4. **Type Safety**: Uses Java records for immutable, type-safe DTOs
5. **Spring Boot Integration**: Full support for externalized configuration

## Backward Compatibility

✅ **Fully backward compatible**
- No changes to existing `ChatModel` implementations
- No changes to existing Chat Completions API usage
- New functionality is additive only
- Default values match OpenAI standards

## Testing

### Unit Tests
- ✅ 5 unit tests in `ResponsesApiTest`
- ✅ All existing tests continue to pass
- ✅ No compilation errors

### Integration Tests
- ✅ 4 new integration tests in `OpenAiApiIT`
- ✅ Cover synchronous, streaming, configuration, and built-in tools
- ✅ Minimal cost (~$0.0002-$0.0005 per run)
- ✅ Serve as usage examples

### Build Verification
- ✅ `spring-ai-openai` module builds successfully
- ✅ `spring-ai-autoconfigure-model-openai` module builds successfully
- ✅ All existing tests pass

## Benefits

1. **Early Access**: Enables developers to use OpenAI's latest agentic capabilities
2. **Built-in Tools**: Simplifies integration with web search, file search, etc.
3. **Future-Ready**: Positions Spring AI for OpenAI's agent-first direction
4. **Flexible**: Low-level API allows custom abstractions to be built on top
5. **Well-Documented**: Comprehensive tests serve as usage examples
6. **Cost-Efficient**: Integration tests designed to minimize API costs

## Future Enhancements

Potential future additions (not in this PR):
1. Higher-level abstractions if patterns emerge
2. Conversation management utilities
3. Response accumulator helpers for streaming
4. Observability support for Responses API calls
5. Integration with Spring AI's advisor pattern (if applicable)

## Migration from Chat Completions

For users wanting to try the Responses API:

| Chat Completions | Responses API |
|------------------|---------------|
| `messages` array | `input` (simplified) |
| Custom function implementation | Built-in tools (no code needed) |
| Manual conversation state | Native multi-turn support |
| Limited reasoning access | Full reasoning capabilities |
| `ChatCompletionRequest` | `ResponseRequest` |

## References

- [OpenAI Responses API Documentation](https://platform.openai.com/docs/api-reference/responses)
- [OpenAI Migration Guide](https://platform.openai.com/docs/guides/migrate-to-responses)
- [Responses vs Chat Completions](https://platform.openai.com/docs/guides/responses-vs-chat-completions)
- [OpenAI Java SDK](https://github.com/openai/openai-java) - Referenced for implementation patterns

## Checklist

- [x] Code compiles without errors
- [x] All existing tests pass
- [x] New unit tests added and passing
- [x] New integration tests added and passing
- [x] Documentation updated
- [x] Autoconfiguration support added
- [x] Spring Boot properties supported
- [x] Backward compatible
- [x] Follows existing code conventions
- [x] No breaking changes

---

**Note**: This PR intentionally does **not** integrate the Responses API with the high-level `ChatModel` abstractions. The Responses API represents a fundamentally different paradigm (stateful agents vs stateless chat) that doesn't fit the existing abstractions. This low-level API access allows the Spring AI community to experiment and potentially develop appropriate higher-level abstractions in the future.

Signed-off-by: Dmitry Bedrin <[email protected]>
@filiphr (Contributor) commented Dec 5, 2025

> Note: This PR intentionally does not integrate the Responses API with the high-level ChatModel abstractions. The Responses API represents a fundamentally different paradigm (stateful agents vs stateless chat) that doesn't fit the existing abstractions. This low-level API access allows the Spring AI community to experiment and potentially develop appropriate higher-level abstractions in the future.

Why are you saying this, @bedrin? According to https://platform.openai.com/docs/guides/migrate-to-responses, OpenAI themselves recommend the new Responses API for new projects:

> While Chat Completions remains supported, Responses is recommended for all new projects.

Some models, e.g. gpt-5.1-codex, do not support the Chat Completions API and only support the Responses API. I believe a `ChatModel` abstraction can be built on top of the Responses API.

This PR is a step in a good direction. However, I do believe that Spring AI itself should support the Responses API through the `ChatModel` abstraction.

@bedrin (Contributor, Author) commented Dec 5, 2025

@filiphr I meant that we cannot use all the functionality of the new Responses API through the existing Spring AI `ChatModel` abstraction. But indeed we can still use it that way - you're right. I think that should be a separate PR, though.
