
enhance(daily): richer report content + progressive delivery (streaming-edit or batched) #423

@eanzhao


Summary

Day-one enhancement plan for /daily. The current report is functional after #421 closed the GitHub-403 root cause, but the content is thin and the delivery is one-shot — the user gets a single Lark message after the agent finishes, with no progress indication and no per-source breakdown. Two related areas to fix:

  1. Content depth — the skill prompt at agents/Aevatar.GAgents.ChannelRuntime/AgentBuilderTemplates.cs:48-64 only suggests three GitHub queries (commits authored, issues authored, issues commented) and asks for "3-6 concise bullet points + one blocker line". This produces something like a TL;DR, not a daily.
  2. Delivery UX: SkillRunnerGAgent.SendOutputAsync (agents/Aevatar.GAgents.ChannelRuntime/SkillRunnerGAgent.cs:254-300) sends the entire LLM output in one POST /open-apis/im/v1/messages call. There is no streaming-edit equivalent of the normal chat path (ChannelLlmReplyInbox → TurnStreamingReplySink → /channel-relay/reply/update). The user sees nothing for ~30s, then a wall of text.

Why current streaming doesn't transfer to SkillRunner

TurnStreamingReplySink works against the reply token baked into the inbound webhook payload — that token has a ~14-minute TTL and is bound to one specific user-initiated turn. Scheduled SkillRunner runs (the daily 9am cron, the manual /run-agent, retries) don't have a reply token; they're ambient. Trying to reuse /channel-relay/reply/update would land as reply_token_missing_or_expired.

The closest equivalent for SkillRunner is edit-own-message via Lark's PATCH /open-apis/im/v1/messages/{message_id} (text and card update endpoints already exist — the channel-runtime adapter uses them today for chat streaming). That gives SkillRunner the same "watch the message grow" UX without depending on a reply-token grant.

Proposed scope

A. Content depth (skill prompt + suggested data sources)

Rewrite TryBuildDailyReportSpec so the daily covers more surface area, with explicit structured sections and a length budget per section. Treat the new prompt as a specification of what to fetch + how to summarize, not a freeform creative brief.

Suggested sections (each with a hard ≤N-line budget, omitted entirely if empty rather than padded):

  • Shipped — PRs merged in last 24h (title + repo + #PR), commits to default branch
  • In flight — open PRs authored, with stale-flag (>24h since last activity)
  • Reviews — PRs reviewed (approve/request-changes/comment counts), review comments left
  • Issues — issues opened, closed, commented on
  • CI status — recent failing builds on default branch (if any)
  • Trend vs yesterday — comparing today's counts against the prior 24h window
  • Blockers — auto-detected: PRs >24h waiting for review, CI red >2h, GitHub blocked/needs-info labels
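The blocker heuristics above reduce to simple predicates over the fetched activity. A minimal Python sketch, assuming simplified dict shapes (the field names here are illustrative, not the real GitHub API payloads); the thresholds come straight from the list:

```python
from datetime import datetime, timedelta, timezone

# Thresholds from the Blockers bullet above.
REVIEW_WAIT = timedelta(hours=24)
CI_RED = timedelta(hours=2)
BLOCKER_LABELS = {"blocked", "needs-info"}  # label names are assumptions

def detect_blockers(open_prs, ci_runs, now=None):
    """open_prs / ci_runs are simplified dicts, not real GitHub API payloads."""
    now = now or datetime.now(timezone.utc)
    blockers = []
    for pr in open_prs:
        if now - pr["last_activity"] > REVIEW_WAIT and not pr["reviewed"]:
            blockers.append(f"PR #{pr['number']} waiting for review >24h")
        hits = BLOCKER_LABELS & set(pr.get("labels", []))
        if hits:
            blockers.append(f"PR #{pr['number']} labeled {sorted(hits)}")
    # "CI red >2h": latest run on the default branch failed more than 2h ago.
    runs = sorted(ci_runs, key=lambda r: r["finished_at"])
    if runs and runs[-1]["conclusion"] == "failure" and now - runs[-1]["finished_at"] > CI_RED:
        blockers.append("CI red on default branch >2h")
    return blockers
```

The point of keeping these as deterministic post-processing (rather than asking the LLM to "notice" blockers) is that the Blockers section stays trustworthy even when the model is tempted to pad.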

GitHub queries to suggest (replacing the current 3):

GET /search/issues?q=author:{u}+is:pr+is:merged+merged:>={iso}      # shipped PRs
GET /search/issues?q=author:{u}+is:pr+is:open                       # in flight
GET /search/issues?q=reviewed-by:{u}+updated:>={iso}                # reviews
GET /search/issues?q=author:{u}+is:issue+created:>={iso}            # issues opened
GET /search/issues?q=author:{u}+is:issue+is:closed+closed:>={iso}   # issues closed
GET /repos/{owner}/{repo}/actions/runs?branch=main&per_page=10      # CI on each tracked repo

When the user provides repositories=…, the prompt must instruct the LLM to make these calls once per repo rather than collapsing into a global search — the global /search/* endpoints don't filter to a repo allowlist cleanly.
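The per-repo fan-out can be sketched by appending a repo: qualifier to each suggested query instead of issuing one global search. build_daily_queries is a hypothetical helper for illustration, not code from the repo; the templates mirror the list above:

```python
# Query templates copied from the suggested list above.
QUERY_TEMPLATES = [
    "author:{u}+is:pr+is:merged+merged:>={iso}",    # shipped PRs
    "author:{u}+is:pr+is:open",                     # in flight
    "reviewed-by:{u}+updated:>={iso}",              # reviews
    "author:{u}+is:issue+created:>={iso}",          # issues opened
    "author:{u}+is:issue+is:closed+closed:>={iso}", # issues closed
]

def build_daily_queries(user, since_iso, repos=None):
    """No repos: one global search per template. With repos: fan out
    per repo via a repo:{owner}/{name} qualifier."""
    base = [t.format(u=user, iso=since_iso) for t in QUERY_TEMPLATES]
    if not repos:
        return [f"/search/issues?q={q}" for q in base]
    return [f"/search/issues?q={q}+repo:{r}" for r in repos for q in base]
```

With N repos this is 5N search calls (plus one Actions call per repo), which is worth stating in the prompt so the LLM budgets its tool calls accordingly.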

Future scope (out for the first cut, but the prompt should leave room for):

  • Calendar provider via NyxID (api-google-calendar?) — meetings attended/upcoming
  • Notion / Linear / Jira if the user has them connected — pages edited, tickets moved
  • Slack/Lark message highlights via NyxID-bridge

B. Progressive delivery

Two real options, both implementable in this codebase:

Option 1 — batched (cheap, ships fast)
SkillRunner sends one Lark message per section as sections are produced. Implementation: change SendOutputAsync to consume multiple outputs from the LLM turn (the LLM emits one structured envelope {section_id, header, body} per tool_call boundary, or use stop-sequence sectioning). For each section, send a fresh POST /im/v1/messages.

Pros: simplest; each section is an atomic artifact in the chat history.
Cons: chat clutter; user sees N notifications.
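One possible realization of the Option 1 envelope, sketched in Python for illustration: the LLM emits one JSON object per line with the {section_id, header, body} shape from the issue text, and each non-empty envelope becomes its own POST /im/v1/messages. The JSON-lines framing is an assumption; stop-sequence sectioning would parse differently but yield the same stream of sections:

```python
import json

def iter_sections(llm_output: str):
    """Yield (section_id, message_text) pairs from a JSON-lines envelope stream.
    Empty-body sections are dropped, matching the omit-empty-sections rule."""
    for line in llm_output.splitlines():
        line = line.strip()
        if not line:
            continue
        env = json.loads(line)
        if env.get("body"):
            yield env["section_id"], f"{env['header']}\n{env['body']}"
```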

Option 2 — streaming-edit (richer UX, more work)
SkillRunner sends an initial placeholder Lark message and progressively edits it via PATCH /open-apis/im/v1/messages/{message_id}. Reuses the same TurnStreamingReplySink pattern but bound to the freshly-sent message_id instead of a reply token.

Concrete steps:

  1. New SkillRunnerStreamingReplySink in agents/Aevatar.GAgents.ChannelRuntime/ modeled after TurnStreamingReplySink but driven by NyxIdApiClient.UpdateChannelRelayTextReplyAsync-equivalent calls against s/api-lark-bot/open-apis/im/v1/messages/{id} (existing edit endpoint, already wrapped by ChannelConversationTurnRunner for normal chat — confirm it works without reply token).
  2. SkillRunnerGAgent.HandleTriggerAsync plumbs the sink through to LLMService so the OnDelta callbacks land on edit-the-message instead of buffer-and-send-once.
  3. Throttle edits the same way TurnStreamingReplySink does (Lark has its own rate limit on edits, default 5 edits/sec; the existing throttle is conservative enough).
  4. FinalizeAsync does one final edit with the complete text — same shape as the existing sink.
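The steps above reduce to a small sink shape: buffer deltas, edit the already-sent message at most once per throttle interval, and always finish with one complete edit. A Python sketch of that shape (the real implementation would be a C# sink next to TurnStreamingReplySink; StreamingEditSink and edit_message are hypothetical names, with edit_message standing in for the Lark PATCH /im/v1/messages/{message_id} call):

```python
import time

class StreamingEditSink:
    """Buffer streamed deltas and throttle edits to one per min_interval seconds."""
    def __init__(self, edit_message, message_id, min_interval=1.0):
        self.edit_message = edit_message      # callable(message_id, full_text)
        self.message_id = message_id
        self.min_interval = min_interval
        self.buffer = []
        self.last_edit = float("-inf")        # first delta always edits

    def on_delta(self, chunk: str):
        self.buffer.append(chunk)
        now = time.monotonic()
        if now - self.last_edit >= self.min_interval:
            self.edit_message(self.message_id, "".join(self.buffer))
            self.last_edit = now

    def finalize(self):
        # Mirror of step 4 / FinalizeAsync: one last edit with the complete text.
        self.edit_message(self.message_id, "".join(self.buffer))
```

Note each edit sends the full accumulated text, not a delta: message-edit APIs replace the body, so the sink must always hold the whole message so far.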

Pros: matches normal chat UX (one message that grows); no chat clutter.
Cons: more code; need to verify Lark message-edit doesn't have a "max edits" limit per message (rate limit yes, total count probably unlimited for text but worth confirming for cards).

Recommendation: ship Option 2. The infrastructure for edit-message is already in NyxIdApiClient (used by normal chat streaming), and the UX is what users actually expect when they trigger a multi-source report. Option 1 is a fallback if something blocks Option 2 mid-implementation.

C. Cross-cutting / safety net

  • Failure-notification path is currently broken under cross-tenant Lark setups (SkillRunnerGAgent:407 TrySendFailureAsync goes through the same s/api-lark-bot proxy that just rejected with 99992364, so the user never sees the failure either). Either reuse the inbound-webhook reply token for failure notification when it's still in TTL, or store a recent-channel-bot fallback.
  • Content boundary — the LLM should not invent activity when sources are empty. Current prompt says "say so plainly", but with 7+ sections it's tempting for the model to pad. Add a per-section "if zero results, omit the section entirely; if everything is empty, send 'No measurable activity in the last 24h.'" instruction.
  • Length cap — Lark text messages have a body size limit (around 30KB). With richer content + multi-repo + multi-source, we can blow past it. Implement chunked delivery for the text path (split on section boundaries, send N messages) before any of the above ships, or route through cards (which have their own block-count limit but no body limit).
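The section-boundary chunking in the last bullet can be sketched as greedy packing: fit whole sections into a message until the next one would exceed the byte limit, then start a new message. The 30KB figure and UTF-8 byte accounting are assumptions from the bullet above, and chunk_on_sections is a hypothetical helper:

```python
MAX_BYTES = 30_000  # approximate Lark text-body limit, per the bullet above

def chunk_on_sections(sections, max_bytes=MAX_BYTES):
    """Greedily pack whole sections into messages so no body exceeds max_bytes
    (UTF-8). A single oversized section is hard-split as a last resort."""
    chunks, current = [], ""
    for sec in sections:
        candidate = f"{current}\n\n{sec}" if current else sec
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
            continue
        if current:
            chunks.append(current)
        while len(sec.encode("utf-8")) > max_bytes:
            chunks.append(sec[:max_bytes])  # crude character split; fine for ASCII
            sec = sec[max_bytes:]
        current = sec
    if current:
        chunks.append(current)
    return chunks
```

Splitting on section boundaries rather than raw byte offsets keeps each delivered message self-contained, which also makes the batched Option 1 path a free by-product.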

Acceptance

  • Daily prompt rewritten with structured sections, repo-aware query suggestions, and "omit empty sections" guidance
  • At least one of Option 1 or Option 2 lands; Option 2 preferred
  • Failure-notification path no longer dies silently when the outbound proxy returns a 4xx
  • A new test under AgentBuilderToolTests (or a new file) pins the structured prompt's "omit empty section" instruction so future copy edits don't regress it
  • Smoke test on a real GitHub user where commits authored is empty but PRs reviewed is non-empty — the report skips the Shipped section and renders Reviews

Out of scope for this issue

  • Calendar / Notion / Linear / Jira integrations (need NyxID provider work upstream)
  • Multi-language daily reports
  • Per-user customization (which sections to include, length preferences) — would belong on the agent config, not the template
