Status: Future idea, not scheduled. Lives in the v0.5+ "maybe" bucket of the roadmap.
Summary
Stream LLM tokens to external consumers while the workflow itself sees only the final response. Pattern: activity streams tokens out-of-band via pub/sub (Redis, NATS, etc.) while the workflow waits on async-completion.
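A minimal sketch of the shape of this pattern, using only the standard library. Everything here is a hypothetical stand-in: `InMemoryPubSub` plays the role of Redis or NATS, `fake_llm_stream` plays the role of the LLM client, and the return value of `streaming_activity` stands for the result the workflow would receive on completion. No Temporal API is used; the point is only that tokens leave through the side channel while the caller sees one final string.

```python
import asyncio
from collections import defaultdict
from typing import AsyncIterator, Dict, List, Optional, Tuple


class InMemoryPubSub:
    """Hypothetical stand-in for Redis/NATS: one asyncio.Queue per subscriber."""

    def __init__(self) -> None:
        self._subscribers: Dict[str, List[asyncio.Queue]] = defaultdict(list)

    def subscribe(self, channel: str) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue()
        self._subscribers[channel].append(queue)
        return queue

    async def publish(self, channel: str, message: Optional[str]) -> None:
        # Fan the message out to every subscriber of this channel.
        for queue in self._subscribers[channel]:
            await queue.put(message)


async def fake_llm_stream(prompt: str) -> AsyncIterator[str]:
    """Hypothetical token source; a real activity would call an LLM client."""
    for token in prompt.split():
        await asyncio.sleep(0)  # yield control, as a network read would
        yield token


async def streaming_activity(bus: InMemoryPubSub, channel: str, prompt: str) -> str:
    """Publish each token out-of-band, return only the final response."""
    tokens: List[str] = []
    async for token in fake_llm_stream(prompt):
        tokens.append(token)
        await bus.publish(channel, token)
    await bus.publish(channel, None)  # None marks end of stream (assumed convention)
    return " ".join(tokens)


async def consumer(queue: asyncio.Queue) -> List[str]:
    """External consumer: collects tokens until the end-of-stream marker."""
    seen: List[str] = []
    while True:
        token = await queue.get()
        if token is None:
            return seen
        seen.append(token)


async def main() -> Tuple[List[str], str]:
    bus = InMemoryPubSub()
    queue = bus.subscribe("run-1")  # subscribe before the activity starts
    streamed, final = await asyncio.gather(
        consumer(queue),
        streaming_activity(bus, "run-1", "hello streaming world"),
    )
    return streamed, final


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The consumer subscribes before the activity starts so that no token is missed; a real backend would need its own answer for late subscribers.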
Why this matters
Users of long-running agents expect token-by-token feedback. This is currently impossible inside a workflow: a token stream is non-deterministic, and workflow code must replay deterministically.
Why not scheduled
Streaming is in fundamental tension with Temporal's determinism. Async completion combined with external pub/sub is the only viable approach, but it requires careful design and a real consumer to validate it.
Open questions
- Default pub/sub backend (if any)?
- How to handle partial-response retries?