
Defer throttled streaming chunks via pending-flush timer#407

Merged
eanzhao merged 4 commits into dev from feat/2026-04-25_streaming-sink-pending-flush on Apr 25, 2026

Conversation

Contributor

eanzhao commented Apr 25, 2026

Summary

TurnStreamingReplySink previously dropped any delta that arrived inside the throttle window until the next delta showed up. When the LLM stalled near the window edge (cold-start, router handoff, tail-of-thought), the user saw the reply freeze for the rest of the throttle interval before the next batch landed.

This PR:

  • Stashes the latest accumulated text on each in-window delta and arms a one-shot deferred flush timer that publishes the stash when the window closes. Multiple in-window deltas collapse onto the latest text.
  • Reflushes after in-flight dispatch (openclaw FlushController-style): while a dispatch is in flight, additional deltas — caller-driven or timer-driven — are stashed; the dispatch loop reflushes the most recent _pendingText after the in-flight chunk completes. The conversation actor still observes strict edit ordering.
  • Makes the sink IDisposable so ChannelLlmReplyInboxRuntime cleans up an unfired timer on paths that skip FinalizeAsync (interactive reply, empty reply, exception). Wired via a using declaration.
  • Adopts Microsoft.Extensions.TimeProvider.Testing.FakeTimeProvider for deterministic timer-driven test control. The hand-rolled FakeTimeProvider in TurnStreamingReplySinkTests is replaced with the BCL-aligned one.
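The stash-and-deferred-flush behavior described in the first two bullets can be sketched in a few lines. This is a minimal Python sketch of the pattern only — the real implementation is the C# `TurnStreamingReplySink`; the names here (`ThrottledSink`, `on_delta`, `publish`) are illustrative, not the actual API.

```python
import threading
import time


class ThrottledSink:
    """Illustrative throttle-with-pending-flush: at most one publish per
    window; in-window deltas are stashed (latest wins) and a one-shot
    timer flushes the stash when the window closes."""

    def __init__(self, publish, window=1.0):
        self._publish = publish            # sends the edit downstream
        self._window = window
        self._lock = threading.Lock()
        self._last_emit = float("-inf")
        self._pending = None               # latest stashed text
        self._timer = None                 # armed deferred-flush timer

    def on_delta(self, accumulated_text):
        with self._lock:
            now = time.monotonic()
            if now - self._last_emit >= self._window:
                self._last_emit = now
                self._publish(accumulated_text)
            else:
                # In-window: stash; multiple deltas collapse onto the latest.
                self._pending = accumulated_text
                if self._timer is None:
                    remaining = self._window - (now - self._last_emit)
                    self._timer = threading.Timer(remaining, self._flush_pending)
                    self._timer.daemon = True
                    self._timer.start()

    def _flush_pending(self):
        with self._lock:
            self._timer = None
            if self._pending is not None:
                self._last_emit = time.monotonic()
                self._publish(self._pending)
                self._pending = None

    def close(self):
        # Dispose path: cancel an unfired timer so nothing flushes later.
        with self._lock:
            if self._timer is not None:
                self._timer.cancel()
                self._timer = None
            self._pending = None
```

With `window=0.2`, three rapid deltas produce two publishes: the first immediately, and the latest stash once the deferred timer fires — so a stalled stream never freezes for the rest of the interval.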

Background: studied https://github.com/ColinLu50/openclaw-lark-streamsrc/card/flush-controller.ts (mutex + reflush-on-conflict) and streaming-card-controller.ts (deferred flush after long gap). This PR adopts the deferred-flush + reflush patterns; it does NOT adopt the OpenClaw CardKit 2.0 streaming card path (we stay on /reply + /reply/update edit-message via NyxID relay).

Out of scope

Follow-up work tracked in #405:

  • Replace Disabled + SuppressInterim boolean pair on NyxRelayStreamingState with an explicit phase enum + PHASE_TRANSITIONS table.
  • Centralize message-unavailable / recall short-circuit on ConversationGAgent so future handlers don't have to duplicate the checks.

Test plan

  • dotnet build sink test project — 0 errors.
  • TurnStreamingReplySinkTests — 12/12 pass (7 retained, 5 new: deferred timer flush, finalize-cancels-timer, dedup-against-last-emitted, dispose-prevents-flush, dispose-idempotent).
  • Full Aevatar.GAgents.ChannelRuntime.Tests — 166/166 pass.
  • Aevatar.GAgents.Channel.Protocol.Tests — 119/119 pass.
  • tools/ci/test_stability_guards.sh — passed (no new Task.Delay / WaitUntilAsync; uses TimeProvider.CreateTimer + FakeTimeProvider.Advance).
  • CI: Channel slice + Distributed slice.

🤖 Generated with Claude Code

Previously a delta that arrived inside the throttle window was silently
dropped until the next delta showed up; when the LLM stalled near the
window edge (cold-start, router handoff, tail-of-thought) the user saw
the reply freeze for the rest of the interval before the next batch
landed.

Stash the latest accumulated text on each in-window delta and arm a
one-shot deferred flush timer that publishes the stash when the window
closes. Multiple in-window deltas collapse onto the latest text. While
a dispatch is in flight, additional deltas (caller-driven or
timer-driven) are stashed and the dispatch loop reflushes the most
recent _pendingText after the in-flight chunk completes
(reflush-on-conflict), so the conversation actor still observes strict
edit ordering. Make the sink IDisposable so ChannelLlmReplyInboxRuntime
cleans up an unfired timer on paths that skip FinalizeAsync (interactive
reply, empty reply, exception).

Adopts Microsoft.Extensions.TimeProvider.Testing.FakeTimeProvider for
deterministic timer-driven test control. The hand-rolled FakeTimeProvider
in TurnStreamingReplySinkTests is replaced with the BCL-aligned one.
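Why a controllable time source matters for these tests can be shown with a toy analogue of the provider (the real one is `Microsoft.Extensions.TimeProvider.Testing.FakeTimeProvider`; this Python sketch only mimics its advance-fires-due-timers idea, with hypothetical names):

```python
class FakeClock:
    """Toy stand-in for FakeTimeProvider: tests advance virtual time
    explicitly and any one-shot timers that come due fire synchronously,
    so timer-driven paths run deterministically with no real sleeping."""

    def __init__(self):
        self.now = 0.0
        self._timers = []                  # list of (due_time, callback)

    def create_timer(self, delay, callback):
        self._timers.append((self.now + delay, callback))

    def advance(self, seconds):
        self.now += seconds
        due = [t for t in self._timers if t[0] <= self.now]
        self._timers = [t for t in self._timers if t[0] > self.now]
        for _, callback in sorted(due, key=lambda t: t[0]):
            callback()                     # fire in due-time order
```

A test arms the deferred flush, asserts nothing fires before the window closes, then advances past the due time and asserts exactly one flush — no `Task.Delay`, no flakiness.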

Refs #405 for the follow-up phase-state-machine + centralized
unavailable-guard refactor on ConversationGAgent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
eanzhao changed the base branch from feat/2026-04-24_lark-streaming-reply to dev on April 25, 2026 04:09

chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 94409bd382


Comment on lines +141 to +143
_pendingText = text;
_hasPending = true;
return;

P1: Wait for in-flight dispatch when finalizing stream

When _dispatchInProgress is true, this path only stashes _pendingText and returns immediately, so FinalizeAsync can complete before the final chunk is dispatched. ChannelLlmReplyInboxRuntime.ProcessAsync then sends LlmReplyReadyEvent right after awaiting FinalizeAsync, and if that ready event is handled first, later chunk events are treated as late and dropped in ConversationGAgent.HandleLlmReplyStreamChunkAsync (the processed-command guard). This makes final-stream delivery race-dependent and can skip the intended final edit path.



codecov Bot commented Apr 25, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.38%. Comparing base (d102e47) to head (07084b9).
⚠️ Report is 5 commits behind head on dev.

@@            Coverage Diff             @@
##              dev     #407      +/-   ##
==========================================
- Coverage   70.39%   70.38%   -0.01%     
==========================================
  Files        1175     1175              
  Lines       84452    84452              
  Branches    11124    11124              
==========================================
- Hits        59446    59441       -5     
- Misses      20715    20720       +5     
  Partials     4291     4291              
Flag Coverage Δ
ci 70.38% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.
see 3 files with indirect coverage changes


eanzhao and others added 2 commits April 25, 2026 13:26
`FinalizeAsync` previously stashed the final text and returned immediately
when a dispatch was in flight. The inbox runtime then sent
`LlmReplyReadyEvent` right after, racing the late final chunk past
`ConversationGAgent`'s processed-command guard which would silently drop it.

`TurnStreamingReplySink` now creates a `_drainTcs` when `FinalizeAsync`
stashes onto an in-flight dispatch and awaits it. The dispatch loop signals
that TCS once it fully drains (success, swallowed dispatch failure, or
exception), so `FinalizeAsync` only returns after the final chunk has been
dispatched. `Dispose` also releases any pending waiter so a finalize-then-
dispose path cannot hang.

Adds `FinalizeAsync_DispatchInFlight_WaitsForFinalChunkOnWire` regression
covering the race using a gated mock actor.

Codex review: #407 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
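The drain mechanics this commit describes can be sketched as follows. A hedged Python sketch of the pattern under a simplified synchronous dispatch loop — `DrainableSink`, `emit`, and `finalize` are illustrative names, not the actual `TurnStreamingReplySink` members, and the real `_drainTcs` is a `TaskCompletionSource` rather than an event.

```python
import threading


class DrainableSink:
    """Sketch of the finalize fix: if finalize() lands while a dispatch is
    in flight, it stashes the final text and blocks on a drain event that
    the dispatch loop sets once everything is sent, so the caller cannot
    announce "reply ready" before the final chunk is on the wire."""

    def __init__(self, publish):
        self._publish = publish
        self._lock = threading.Lock()
        self._in_flight = False
        self._pending = None               # reflush-on-conflict stash
        self._drained = None               # event armed by finalize()

    def emit(self, text):
        with self._lock:
            if self._in_flight:
                self._pending = text       # stash latest; loop will reflush
                return
            self._in_flight = True
        self._dispatch_loop(text)

    def _dispatch_loop(self, text):
        while True:
            self._publish(text)            # send one chunk
            with self._lock:
                if self._pending is not None:
                    text, self._pending = self._pending, None
                    continue               # reflush the most recent stash
                self._in_flight = False
                if self._drained is not None:
                    self._drained.set()    # release a waiting finalize()
                return

    def finalize(self, final_text):
        with self._lock:
            if self._in_flight:
                self._pending = final_text
                self._drained = threading.Event()
                waiter = self._drained
            else:
                waiter = None
        if waiter is None:
            self.emit(final_text)          # no conflict: dispatch directly
        else:
            waiter.wait()                  # only return once fully drained

    def close(self):
        # Dispose path: release any waiter so a finalize-then-dispose
        # sequence cannot hang.
        with self._lock:
            if self._drained is not None:
                self._drained.set()
```

A delta that arrives while a chunk is in flight is stashed and reflushed by the loop, and `finalize` returns only after the drain — which is exactly what closes the race the Codex review flagged.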
@eanzhao eanzhao merged commit 55ccbaf into dev Apr 25, 2026
12 checks passed