The pain with the current Assistants API
The current Assistants API is not just slow, it's also unreliable. The black-box nature of the API makes it hard to know what's happening: I create a Run and have no idea what it's doing. Is it searching the vector store? Is it generating text? Is it stuck? You can't see inside, and polling only returns vague statuses. This makes debugging painful: everything looks correct, yet a run still fails, and then after a couple of retries it starts working again. These failures happen at random times, so we can't predict or reproduce them.
Why use the new Responses API?
- It supports both stateful and stateless runs.
- It also supports tool calling, just like the Assistants API.
- It supports streaming of messages.
- Better state management: instead of complicated Thread logic, you just pass a previous_response_id.
- Instead of "create a run, poll it forever", you make one API call.
- OpenAI is shutting the Assistants API down in 2026.
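To make the last two points concrete, here is a minimal sketch of the single-call, previous_response_id flow. The model name and the `ask` helper are assumptions for illustration; it needs the `openai` package and an API key, so the live call is shown commented out. Check the SDK docs for your version before relying on exact parameter shapes.

```python
# Sketch: chaining turns with the Responses API via previous_response_id,
# instead of Threads + Runs + polling. Model name is an assumption.

def ask(client, user_text, previous_response_id=None):
    """Send one turn; pass the previous response's id to keep context."""
    return client.responses.create(
        model="gpt-4o-mini",                      # assumed model name
        input=user_text,
        previous_response_id=previous_response_id,
    )

# Usage (commented out to avoid a live API call):
# from openai import OpenAI
# client = OpenAI()
# first = ask(client, "What plans do you offer?")
# follow_up = ask(client, "Which one is cheapest?", first.id)
```

No Thread objects, no run-status polling loop: each call returns the response directly, and the next call links back to it by id.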
NOTE:
- We are not going to ditch the Assistants API entirely right now; we will start by using the Responses API for the support LLM.
- For a user-facing AI support system, streaming the AI response while the answer is being generated is much better UX, and it also feels way faster.
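For the streaming UX point above, here is a rough sketch of consuming a Responses API stream chunk by chunk. The event type string and `delta` attribute follow the SDK's streaming events as I understand them, but treat the exact shapes as assumptions and verify against the SDK docs; `on_delta` is a hypothetical callback that would push chunks to the UI.

```python
# Sketch: stream a Responses API answer token-by-token so the support UI
# can render text as it's generated. Model name is an assumption.

def stream_answer(client, user_text, on_delta):
    """Stream one response, invoking on_delta for each text chunk."""
    stream = client.responses.create(
        model="gpt-4o-mini",   # assumed model name
        input=user_text,
        stream=True,
    )
    chunks = []
    for event in stream:
        # Assumed event type; other events (created, completed) are skipped.
        if event.type == "response.output_text.delta":
            chunks.append(event.delta)
            on_delta(event.delta)  # e.g. forward over SSE/WebSocket to the UI
    return "".join(chunks)
```

The user starts reading the answer immediately instead of staring at a spinner while a run is polled, which is where the "feels way faster" effect comes from.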