fix(messaging): harden proactive broadcast parity#510
fix(messaging): harden proactive broadcast parity#510vsumner wants to merge 4 commits intocodex/cron-runtime-observabilityfrom
Conversation
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/messaging/mattermost.rs (1)
1008-1018:⚠️ Potential issue | 🟠 MajorValidate Mattermost target IDs before they enter the classified-send path.
dm:{user_id}targets are sent straight intoget_or_create_dm_channel(), and direct channel targets are only validated later insidecreate_post(). That means obviously malformed IDs can still surface throughmark_classified_broadcast, so the cron retry loop may retry permanently bad targets as if they were send failures. CallSelf::validate_id()on both target shapes here and map those failures withmark_permanent_broadcast.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/messaging/mattermost.rs` around lines 1008 - 1018, Validate Mattermost target IDs before resolving DMs or passing them down the classified-send path: call Self::validate_id(user_id) and map any Err to crate::messaging::traits::mark_permanent_broadcast before calling get_or_create_dm_channel(user_id) in the dm:{user_id} branch, and likewise call Self::validate_id(target) (mapping errors to mark_permanent_broadcast) for direct channel targets instead of deferring validation to create_post(); keep the rest of the logic (get_or_create_dm_channel, resolved_target usage) intact.src/messaging/slack.rs (1)
1148-1160:⚠️ Potential issue | 🟠 MajorDon't retry a partially delivered chunked Slack broadcast.
After the first
chat_post_messagesucceeds, this call has already delivered part of the broadcast. If a later chunk gets classified as transient here, cron will retry from chunk 1 and duplicate the chunks Slack already accepted. After the first successful post, this path should return a non-retryable/partial-delivery error instead.Possible fix
OutboundResponse::Text(text) => { + let mut sent_any = false; for chunk in split_message(&text, 12_000) { let mut req = SlackApiChatPostMessageRequest::new( channel_id.clone(), markdown_content(chunk), ); req = req.opt_thread_ts(thread_ts.clone()); - session - .chat_post_message(&req) - .await - .context("failed to broadcast slack message") - .map_err(crate::messaging::traits::mark_classified_broadcast)?; + match session + .chat_post_message(&req) + .await + .context("failed to broadcast slack message") + { + Ok(_) => sent_any = true, + Err(error) => { + return Err(if sent_any { + crate::messaging::traits::mark_permanent_broadcast(error) + } else { + crate::messaging::traits::mark_classified_broadcast(error) + }); + } + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/messaging/slack.rs` around lines 1148 - 1160, The loop in OutboundResponse::Text uses split_message and calls session.chat_post_message for each chunk; if an earlier chunk succeeds but a later chunk fails and is classified transient via crate::messaging::traits::mark_classified_broadcast, the whole broadcast will be retried from chunk 1 and duplicate delivered chunks. Fix by tracking whether any chunk has already been successfully posted (e.g., a boolean like sent_any_chunk) and, when session.chat_post_message returns an error for a later chunk, convert transient-classified errors into a non-retryable partial-delivery error (create or reuse a variant meaning "partial delivery — do not retry") instead of letting mark_classified_broadcast cause a full retry; update the mapping around session.chat_post_message (the SlackApiChatPostMessageRequest + session.chat_post_message call) to check sent_any_chunk and return the non-retryable/partial-delivery error when appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/messaging/email.rs`:
- Around line 520-524: The predicate ensure_supported_broadcast_response
currently treats any RichMessage as supported even though the email sender only
uses response.text; update the check or the send path so RichMessage is only
accepted when it has a usable plaintext fallback: either (A) tighten
supports_email_broadcast_response/ensure_supported_broadcast_response to require
that a RichMessage contains a non-empty text/plain fallback (e.g.,
response.text.is_some_and(|t| !t.is_empty())), or (B) when sending in the
broadcast arm that handles RichMessage, derive a plaintext body from the rich
payload (extract card titles, bodies, choice labels, or poll question into a
single text fallback) and use that for the email body; change the predicate
and/or the broadcast send logic (references:
ensure_supported_broadcast_response, RichMessage, response,
supports_email_broadcast_response) so a card-only message cannot pass without
producing non-empty text.
In `@src/messaging/signal.rs`:
- Around line 953-955: The error currently includes the raw Signal target in the
permanent-failure message; change the call to
crate::messaging::traits::mark_permanent_broadcast inside the same branch to
avoid leaking the target by removing the `{target}` interpolation and instead
use a generic message (e.g. "invalid signal broadcast target format" or "invalid
signal broadcast target: redacted") or explicitly redact the value before
including it; update the anyhow::anyhow! invocation accordingly so
mark_permanent_broadcast and any logs no longer contain the raw phone/UUID/group
identifier.
In `@src/messaging/telegram.rs`:
- Around line 547-556: The predicate passed to
crate::messaging::traits::ensure_supported_broadcast_response currently accepts
every OutboundResponse::RichMessage even though the Telegram send path (the
message-sending logic that only emits text and an optional poll) drops cards and
interactive_elements; update the check in ensure_supported_broadcast_response to
either (a) only accept RichMessage variants that contain a plain text fallback
(e.g., require a non-empty body/text field) and no unsupported fields, or (b)
explicitly restrict acceptance to the subset Telegram can degrade to (e.g.,
RichMessage with text +/- poll) and reject RichMessage values containing cards
or interactive_elements unless you synthesize/derive a text fallback first;
locate and update the predicate around ensure_supported_broadcast_response and
the code that builds the outgoing Telegram payload so both agree on the
supported subset (OutboundResponse::RichMessage with text-only or
text-plus-poll).
In `@src/messaging/twitch.rs`:
- Around line 473-490: The Text and RichMessage branches call client.say() in a
loop over split_message(...) and currently map all failures through
crate::messaging::traits::mark_classified_broadcast which treats them as
retryable broadcast errors; change the logic to track whether any prior chunk
succeeded (e.g., a boolean flag) and if a later chunk fails after at least one
success, return a non-retryable/partial-delivery error instead of a full
retryable broadcast error — update the error mapping on the client.say() calls
in the OutboundResponse::Text and OutboundResponse::RichMessage handling (the
loops that iterate over split_message(&text, MAX_MESSAGE_LENGTH)) so that: if no
chunks have succeeded, keep the original error mapping, but if some chunks
already succeeded map/convert the error to a non-retryable/partial-delivery
classification (use your project’s existing partial/non-retryable error helper)
and return immediately to avoid re-sending earlier chunks.
In `@src/messaging/webchat.rs`:
- Line 77: The code uses extract_webchat_broadcast_text(response) which returns
RichMessage.text and ignores card-derived fallback, letting card-only broadcasts
be treated as success with empty payload; update the places that build
ApiEvent::OutboundMessage (where text is sourced from
extract_webchat_broadcast_text) to derive a plaintext fallback using
OutboundResponse::text_from_cards(&response) when RichMessage.text is empty (or
otherwise reject/return an error for card-only RichMessage), and similarly
adjust other call sites that construct ApiEvent::OutboundMessage so they use
OutboundResponse::text_from_cards or validate RichMessage card-only payloads
before reporting delivery; reference functions/types:
extract_webchat_broadcast_text, OutboundResponse::text_from_cards,
ApiEvent::OutboundMessage, RichMessage.
---
Outside diff comments:
In `@src/messaging/mattermost.rs`:
- Around line 1008-1018: Validate Mattermost target IDs before resolving DMs or
passing them down the classified-send path: call Self::validate_id(user_id) and
map any Err to crate::messaging::traits::mark_permanent_broadcast before calling
get_or_create_dm_channel(user_id) in the dm:{user_id} branch, and likewise call
Self::validate_id(target) (mapping errors to mark_permanent_broadcast) for
direct channel targets instead of deferring validation to create_post(); keep
the rest of the logic (get_or_create_dm_channel, resolved_target usage) intact.
In `@src/messaging/slack.rs`:
- Around line 1148-1160: The loop in OutboundResponse::Text uses split_message
and calls session.chat_post_message for each chunk; if an earlier chunk succeeds
but a later chunk fails and is classified transient via
crate::messaging::traits::mark_classified_broadcast, the whole broadcast will be
retried from chunk 1 and duplicate delivered chunks. Fix by tracking whether any
chunk has already been successfully posted (e.g., a boolean like sent_any_chunk)
and, when session.chat_post_message returns an error for a later chunk, convert
transient-classified errors into a non-retryable partial-delivery error (create
or reuse a variant meaning "partial delivery — do not retry") instead of letting
mark_classified_broadcast cause a full retry; update the mapping around
session.chat_post_message (the SlackApiChatPostMessageRequest +
session.chat_post_message call) to check sent_any_chunk and return the
non-retryable/partial-delivery error when appropriate.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 847a9f49-7a39-4b47-96f2-48f5645b8a9b
📒 Files selected for processing (8)
src/messaging/discord.rssrc/messaging/email.rssrc/messaging/mattermost.rssrc/messaging/signal.rssrc/messaging/slack.rssrc/messaging/telegram.rssrc/messaging/twitch.rssrc/messaging/webchat.rs
249a250 to
9690040
Compare
Review Comment Resolution (Post-Rebase)The following CodeRabbit/Tembo comments have been addressed in the rebased branch: ✅ Fixed
All 6 comments reference outdated code from before the rebase onto No action required - comments can be resolved/dismissed. |
fae2bf3 to
2c3e953
Compare
9690040 to
e690e5e
Compare
Why
broadcast_proactive()retries the entireOutboundResponse, not individual platform sends. That means chunking adapters have to distinguish between "nothing was delivered yet" and "part of this message already landed". If they do not, a transient failure on chunk 2 can cause cron to replay chunk 1 on retry and duplicate already-delivered output.This PR keeps the scope adapter-focused. It does not redesign the messaging trait or retry owner; it makes the existing proactive delivery path truthful and non-duplicating for the reviewed adapters.
What Changed
portaladapter rename drift after the rebase by removing stalewebchathelper/test references and making portal fallback/error reporting consistentTesting
cargo test messaging::traits --libcargo test messaging::manager --libcargo test messaging::portal --libcargo test messaging::slack --libcargo test messaging::discord --libcargo test messaging::mattermost --libcargo test messaging::twitch --libjust preflightjust gate-prNotes
MessagingManager::broadcast_proactive()or theMessagingtrait.Follow-up hardening related to #502.