
Conversation

W3NDO (Contributor) commented Mar 19, 2025

  • Returns {:error, "streaming"} when evaluating the RAG triad of a streaming request.
  • Added keyword() to the @spec for generate_response and to the @callback for generate_text (a sketch of the resulting signatures follows this list).
  • Updated the function clauses to accept the new opts argument.
  • Added 2 tests:
    - A test in rag/generation_test.exs that confirms a stream response is returned.
    - A test in rag/evaluation_test.exs that ensures an error is returned when attempting to evaluate a streaming request.
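
A minimal sketch of what the updated signatures might look like (the names, argument order, and return types here are assumptions drawn from this thread, not the actual diff):

  # Sketch only: signatures assumed from the discussion, not taken from the PR diff.
  @spec generate_response(Generation.t(), keyword()) :: Generation.t()

  @callback generate_text(provider :: struct(), prompt :: String.t(), opts :: keyword()) ::
              {:ok, String.t() | Enumerable.t()} | {:error, any()}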

W3NDO added 3 commits March 19, 2025 13:23
- Updated the Rag.Generation.t type so that the response field can be a stream/enumerable.
- Returns `{:error, "streaming"}` when evaluating the RAG triad of a streaming request.
- Added `keyword()` to the `@spec` for `generate_response` and the `@callback` for `generate_text`.
- Updated the function clauses to accept the new `opts` argument.
- Added 2 tests:
  - A test in `rag/generation_test.exs` that confirms a stream response is returned.
  - A test in `rag/evaluation_test.exs` that ensures an error is returned when attempting to evaluate a streaming request.
- Updated the test to raise the Streaming error.
W3NDO (Contributor, Author) commented Mar 20, 2025

@joelpaulkoch I think this is ready for a review so we can see if I am heading in the right direction. I am unsure about the evaluation test and what other behaviours I may need to update for Rag.Ai.Provider.

joelpaulkoch (Member)

@W3NDO just want to let you know that I'll have a look at it tomorrow, thank you!

W3NDO (Contributor, Author) commented Mar 21, 2025

@joelpaulkoch Sounds good. I should be available to make changes.

joelpaulkoch (Member) left a comment


This goes very much in the right direction, thank you!

I left some comments, and we still have to implement streaming in the providers Rag.Ai.Nx, Rag.Ai.OpenAI, and Rag.Ai.Cohere.

Just let me know if you have further questions or need help with something.

W3NDO added 2 commits March 31, 2025 19:16
- removed `.tool-versions`
- updated the code with the requested changes
- working on implementing streaming in the providers `Rag.Ai.Nx`, `Rag.Ai.OpenAI`, and `Rag.Ai.Cohere`
- removed `.tool-versions` locally and in the repo
W3NDO (Contributor, Author) commented Mar 31, 2025

@joelpaulkoch I wanted to ask if you could explain how we would go about doing this for Rag.Ai.Nx, for example?

I left some comments, and we still have to implement streaming in the providers Rag.Ai.Nx, Rag.Ai.OpenAI, and Rag.Ai.Cohere.

What I was thinking is that in Rag.Ai.Nx it would be an extension of generate_text/3. However, I got stuck on how to write a test for it. Should we also just raise the Streaming error? If so, would we do something similar for Rag.Ai.Cohere and Rag.Ai.OpenAI?

joelpaulkoch (Member)

Sorry, it took me a while to respond.
For Rag.Ai.Cohere and Rag.Ai.OpenAI we can add a stream: true option to generate_text/3 and then pass "stream" => true to the API.
In the case of Rag.Ai.Nx, streaming must be configured when creating the serving.
So there is nothing we can do with options; maybe we should raise and say "streaming must be configured when creating the serving" when someone passes a stream option?
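
For the HTTP providers, the option could be threaded into the request body roughly like this (a sketch: build_body/3 and the field names are illustrative, not actual Rag code):

  # Hypothetical helper: forwards the caller's :stream option to the API body.
  defp build_body(provider, prompt, opts) do
    body = %{"model" => provider.text_model, "prompt" => prompt}

    if Keyword.get(opts, :stream, false) do
      # Per the suggestion above, pass "stream" => true through to the API.
      Map.put(body, "stream", true)
    else
      body
    end
  end
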
We must still handle the stream inside Rag.Ai.Nx; at the moment this is the code in generate_text/3:

  def generate_text(%__MODULE__{} = provider, prompt, _opts) when is_binary(prompt) do
    # The non-streaming serving returns a results map with a single entry.
    %{results: [result]} =
      Nx.Serving.batched_run(provider.text_serving, prompt)

    {:ok, result.text}
  rescue
    error ->
      {:error, error}
  end

When we're streaming we would receive only the stream from Nx.Serving.batched_run/2.
So we would have to implement it somehow like this:

  def generate_text(%__MODULE__{} = provider, prompt, _opts) when is_binary(prompt) do
    case Nx.Serving.batched_run(provider.text_serving, prompt) do
      # Non-streaming serving: a results map, as before.
      %{results: [result]} -> {:ok, result.text}
      # Streaming serving: batched_run returns the stream itself.
      stream -> {:ok, stream}
    end
  rescue
    error ->
      {:error, error}
  end
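
Following the raising idea above, Rag.Ai.Nx could also reject a stream option it cannot honor. A sketch, with do_generate_text/2 as an assumed private helper; the raise sits outside the rescue so it isn't swallowed and converted to {:error, ...}:

  def generate_text(%__MODULE__{} = provider, prompt, opts) when is_binary(prompt) do
    # Streaming for Nx is configured on the serving, not per call.
    if Keyword.get(opts, :stream, false) do
      raise ArgumentError, "streaming must be configured when creating the serving"
    end

    do_generate_text(provider, prompt)
  end

  defp do_generate_text(provider, prompt) do
    case Nx.Serving.batched_run(provider.text_serving, prompt) do
      %{results: [result]} -> {:ok, result.text}
      stream -> {:ok, stream}
    end
  rescue
    error -> {:error, error}
  end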

Regarding testing, at the moment we mock Nx.Serving.batched_run/2 in test/rag/generation/nx_test.exs:

      expect(Nx.Serving, :batched_run, fn _serving, prompt ->
        assert prompt == "a prompt"
        %{results: [%{text: "a response"}]}
      end)

We could do the same in a test for streaming but return a stream instead:

      expect(Nx.Serving, :batched_run, fn _serving, prompt ->
        assert prompt == "a prompt"
        # Any lazy enumerable works here; take_while with a truthy fun keeps every element.
        Stream.take_while(["this", "is", "a", "stream"], & &1)
      end)
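
The assertion side of that test might then just check that the stream is handed back untouched (a sketch; the provider setup and serving name here are assumed):

      # Sketch: TestServing is a placeholder; the struct fields are assumed.
      provider = %Rag.Ai.Nx{text_serving: TestServing}

      assert {:ok, stream} = Rag.Ai.Nx.generate_text(provider, "a prompt", [])
      assert Enum.to_list(stream) == ["this", "is", "a", "stream"]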

W3NDO (Contributor, Author) commented Apr 14, 2025

@joelpaulkoch I will start working on this this week.

W3NDO added 2 commits May 1, 2025 13:05
- updating so I can work on this branch on a different host. Remember to squash this.
- http streaming is failing
- need help with the tests
W3NDO (Contributor, Author) commented May 1, 2025

@joelpaulkoch sorry this took a while. I ran into some issues with testing the streaming options for Cohere and OpenAI. Further, I could not figure out how to get the Nx test working.

W3NDO (Contributor, Author) commented May 1, 2025

@joelpaulkoch we have 3 failing tests that I am unsure how to make pass. I have explained the issues I may be facing.

joelpaulkoch (Member)

Thank you a lot! I will have another look at the complete PR soon but I think it's basically ready to merge.

joelpaulkoch merged commit 0460925 into bitcrowd:main on May 12, 2025.
3 checks passed.