Lark streaming reply: explicit phase state machine + centralized unavailable guard #405

@eanzhao

Background

PR #374 (feat/2026-04-24_lark-streaming-reply) split the streaming-failure handling on ConversationGAgent into two flags so we no longer reuse a dead NyxID /reply token after the first chunk lands:

  • Disabled — initial send failed before the reply token was consumed; safe to fall back to a single-shot /reply via RunLlmReplyAsync.
  • SuppressInterim — first chunk consumed the token; only /reply/update is valid; no fallback to /reply allowed.

The fix is correct, but the runtime state is now an ad-hoc combination of two booleans plus a derived ReplyTokenConsumed (a boolean computed from PlatformMessageId). This shape will keep accreting flags as more failure modes are added (rate limiting, message recall, edit-unsupported, etc.), and the legality of state transitions lives only in reviewers' heads.

In parallel, the "should I skip this callback because the message is already unavailable?" logic is scattered across ConversationGAgent.HandleLlmReplyStreamChunkAsync, TryStreamedLlmReplyFinalizeAsync, and the static-fallback path. New handlers (e.g. tool-call hooks, or reasoning hooks if/when added) will each have to remember to mirror the same checks.

The OpenClaw Lark plugin (https://github.com/ColinLu50/openclaw-lark-stream) hits the same shape of problem and resolves it with two patterns we should adopt:

  1. An explicit CardPhase state machine with a PHASE_TRANSITIONS map that rejects illegal transitions and records a terminalReason for observability.
  2. A single UnavailableGuard that owns the "should this callback short-circuit" decision, so every entrypoint defers to one method instead of repeating the same checks.

Scope

A. Phase state machine for the per-turn streaming runtime state

Replace the Disabled + SuppressInterim + derived ReplyTokenConsumed shape on NyxRelayStreamingState with an explicit phase enum, e.g.

Idle               // no chunk attempted yet
PlaceholderSent    // first send landed, token consumed
Streaming          // interim edits flowing
SuppressingInterim // post-send interim edit failed; final edit still allowed
DisabledPreSend    // initial send failed before token consumed; /reply fallback allowed
TerminalSucceeded
TerminalPartial    // last flushed text was persisted as the user-visible terminal state

Constraints:

  • Must remain in-memory on the actor (per the CLAUDE.md rule "运行态不持久化", i.e. "runtime state is not persisted"); the dictionary already lives outside State.
  • Define a PhaseTransitions table and reject illegal transitions with a log line at warn level (do NOT throw — actor turns must keep making progress).
  • Capture a TerminalReason on entering any terminal phase for diagnostics.
  • All branches that today read state.Disabled || state.SuppressInterim should be expressed through phase-level helpers (AllowsReplyFallback, AllowsInterimEdit, AllowsFinalEdit).
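
A minimal sketch of how the constraints above could fit together, written in TypeScript purely for illustration (the actor itself is not TypeScript, and none of this is existing code). The phase names come from the list above. The contents of the transition table, the class name StreamingPhaseState, the tryTransition method, and the exact phase sets behind each helper are assumptions that would need to be agreed in review.

```typescript
// Phase names are taken from the proposal; everything else here is hypothetical.
type Phase =
  | "Idle"
  | "PlaceholderSent"
  | "Streaming"
  | "SuppressingInterim"
  | "DisabledPreSend"
  | "TerminalSucceeded"
  | "TerminalPartial";

// ASSUMED legal transitions, keyed by the current phase.
const PHASE_TRANSITIONS: Record<Phase, readonly Phase[]> = {
  Idle: ["PlaceholderSent", "DisabledPreSend"],
  PlaceholderSent: ["Streaming", "SuppressingInterim", "TerminalSucceeded", "TerminalPartial"],
  Streaming: ["SuppressingInterim", "TerminalSucceeded", "TerminalPartial"],
  SuppressingInterim: ["TerminalSucceeded", "TerminalPartial"],
  DisabledPreSend: ["TerminalSucceeded"],
  TerminalSucceeded: [],
  TerminalPartial: [],
};

const TERMINAL_PHASES: ReadonlySet<Phase> = new Set<Phase>([
  "TerminalSucceeded",
  "TerminalPartial",
]);

class StreamingPhaseState {
  phase: Phase = "Idle";
  terminalReason: string | undefined = undefined;
  private readonly warn: (msg: string) => void;

  constructor(warn: (msg: string) => void = console.warn) {
    this.warn = warn;
  }

  // Applies the transition if the table allows it. On an illegal transition
  // it logs at warn level and returns false; it never throws.
  tryTransition(next: Phase, reason?: string): boolean {
    if (!PHASE_TRANSITIONS[this.phase].includes(next)) {
      this.warn(`illegal phase transition ${this.phase} -> ${next}`);
      return false;
    }
    this.phase = next;
    if (TERMINAL_PHASES.has(next)) {
      // Record why the turn ended, for diagnostics.
      this.terminalReason = reason ?? "unspecified";
    }
    return true;
  }

  // Phase-level helpers replacing reads of Disabled / SuppressInterim.
  allowsReplyFallback(): boolean {
    return this.phase === "DisabledPreSend";
  }

  allowsInterimEdit(): boolean {
    return this.phase === "PlaceholderSent" || this.phase === "Streaming";
  }

  allowsFinalEdit(): boolean {
    return ["PlaceholderSent", "Streaming", "SuppressingInterim"].includes(this.phase);
  }
}
```

Under this sketch a handler would call tryTransition and carry on regardless of the result, so an unexpected ordering of callbacks shows up as a warn-level log line rather than a failed actor turn.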

B. Centralized unavailable-message guard

Introduce a single guard helper on ConversationGAgent (or a small dependency) that owns:

  • "Is the upstream message recalled / deleted / 230099-class error?"
  • "Has this turn already been terminated for unavailability?"
  • "Should this callback source short-circuit?"

Every public handler that touches the streaming path (HandleLlmReplyStreamChunkAsync, TryStreamedLlmReplyFinalizeAsync, future reasoning/tool hooks) should defer to this guard at the top of the method instead of repeating ad-hoc checks. New handlers added in the future then only have to call if (ShouldSkipForUnavailable("<source>")) return;.
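
As a rough illustration of the guard's shape, here is a TypeScript sketch (again not the actor's actual code or language). The class name UnavailableGuard mirrors the pattern named in the Background section. The methods isUnavailableError and recordError, the handleStreamChunk function, and the contents of the error-code set are hypothetical. Only the code 230099 appears in this issue, and which other codes count as "230099-class" is left open.

```typescript
// Hypothetical set of platform error codes meaning "message recalled or deleted".
// Only 230099 is taken from the issue text; extend once the class is defined.
const UNAVAILABLE_ERROR_CODES: ReadonlySet<number> = new Set([230099]);

class UnavailableGuard {
  // Name of the callback source that first observed the unavailable message.
  private terminatedBy: string | undefined = undefined;
  private readonly log: (msg: string) => void;

  constructor(log: (msg: string) => void = console.info) {
    this.log = log;
  }

  // "Is the upstream message recalled / deleted / 230099-class error?"
  isUnavailableError(code: number): boolean {
    return UNAVAILABLE_ERROR_CODES.has(code);
  }

  // Marks the turn as terminated if the code is unavailable-class.
  // Returns true when the error was classified as unavailable.
  recordError(source: string, code: number): boolean {
    if (!this.isUnavailableError(code)) return false;
    if (this.terminatedBy === undefined) this.terminatedBy = source;
    return true;
  }

  // "Should this callback source short-circuit?"
  shouldSkipForUnavailable(source: string): boolean {
    if (this.terminatedBy === undefined) return false;
    this.log(`skipping ${source}: message unavailable (first seen by ${this.terminatedBy})`);
    return true;
  }
}

// Illustrative handler: the guard check is the first statement, so the
// handler body contains no unavailability logic of its own.
function handleStreamChunk(guard: UnavailableGuard, chunk: string, sent: string[]): void {
  if (guard.shouldSkipForUnavailable("stream-chunk")) return;
  sent.push(chunk);
}
```

Keeping classification, termination tracking, and the skip decision in one object means a new hook needs only the single guard call at its top, which is the property this section asks for.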

Out of scope

  • Switching the outbound path from /reply + /reply/update (edit-message) to Lark CardKit 2.0 streaming cards. That is a separate Lark-only UX track and will be filed separately if/when reasoning/tool visualization is on the roadmap.
  • The TurnStreamingReplySink pending-flush timer / reflush-on-conflict work, which is handled in a separate PR (a sink-only change; the sink does not touch the actor's failure state).

Labels: enhancement
