
The Assistants API is too slow and mysterious, let's switch to the new Responses API #261

@satti-hari-krishna-reddy

Description

The pain with the current Assistants API

The current Assistants API is not just slow, it is also unreliable. Its black-box nature makes it hard to know what is happening: I create a Run and have no idea what it's doing. Is it searching the vector store? Is it generating text? Is it stuck? You can't see inside, and polling only gives a coarse status. That makes debugging painful: everything looks correct, the Run still fails, and then after a couple of retries it starts working again. Because this happens at random times, we can't predict or reproduce it.
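For context, this is roughly the pattern we are stuck with today. A minimal sketch assuming the official `openai` Python SDK; the thread and assistant IDs are placeholders. All we can observe from the outside is the coarse `run.status`.

```python
import time

from openai import OpenAI

client = OpenAI()

# Kick off a Run on an existing thread and assistant (IDs are placeholders).
run = client.beta.threads.runs.create(
    thread_id="thread_123",
    assistant_id="asst_123",
)

# All we can do is poll a coarse status field: there is no visibility into
# whether the model is searching the vector store, generating text, or stuck.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(
        thread_id="thread_123",
        run_id=run.id,
    )

print(run.status)  # "completed", "failed", "expired", ...
```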

Why use the new Responses API?

  1. It supports both stateful and stateless requests.
  2. It supports tool calling, just like the Assistants API.
  3. It supports streaming of messages as well.
  4. State management is better overall.
  5. Instead of the complicated Thread logic, the new way is to just pass a previous_response_id.
  6. Instead of "create a Run, then poll it forever", you make one API call (see the sketch after this list).
  7. OpenAI plans to shut the Assistants API down in 2026.
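A minimal sketch of that one-call flow, again assuming the official `openai` Python SDK; the model name and vector-store ID are placeholders, and the `file_search` tool config is my assumption for replacing the Assistants-era vector-store lookup.

```python
from openai import OpenAI

client = OpenAI()

# One call: nothing to create and then poll. The file_search tool stands in
# for the Assistants-era vector store lookup (IDs below are placeholders).
first = client.responses.create(
    model="gpt-4o-mini",
    input="How do I reset my password?",
    tools=[{"type": "file_search", "vector_store_ids": ["vs_123"]}],
)
print(first.output_text)

# Follow-up turn: no Thread bookkeeping, just chain on the previous
# response's id and the API carries the conversation state forward.
follow_up = client.responses.create(
    model="gpt-4o-mini",
    input="And what if I never got the reset email?",
    previous_response_id=first.id,
)
print(follow_up.output_text)
```

If we ever want fully stateless calls instead, we can skip previous_response_id and pass the prior turns in the input ourselves.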

NOTE:

  • We are not going to ditch the Assistants API entirely right now; the plan is to adopt the Responses API for the support LLM first.
  • For a user-facing AI support system, streaming the AI response while it is being generated is much better UX and feels much faster; a streaming sketch follows below.
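A minimal streaming sketch for that support flow, under the same SDK assumptions; the event type strings follow the Responses streaming events as I understand them, so verify the exact names against the SDK before relying on them.

```python
from openai import OpenAI

client = OpenAI()

# Stream the answer as it is generated so the support UI can render it live
# instead of waiting for the whole completion (model name is a placeholder).
stream = client.responses.create(
    model="gpt-4o-mini",
    input="My invoice looks wrong, who do I contact?",
    stream=True,
)

for event in stream:
    # Incremental text chunks arrive as output_text delta events.
    if event.type == "response.output_text.delta":
        print(event.delta, end="", flush=True)
    elif event.type == "response.completed":
        print()  # final newline once the full response has arrived
```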
