feat(voice/typescript): `random_interruptions` demo — transport upgrade for real mid-stream audio cut-off

## Context

Follow-up from #561. The structural interrupt-scheduling fix landed: `voiceProceed({ interruptions })` now schedules the AGENT pre-step and fires the barge-in via inline TTS, honoring `delayRange`. The `user_interrupt` event fires with `outcome === "fired_after_speech"`, the `transcriptTruncated` label fires correctly at the cut-off boundary, and the unit test (`proceed-interrupt.test.ts`) locks it in.

But the **bundled Pipecat stub bot** in `python/examples/voice/_bot/bot.py` generates TTS in a burst (~50 ms of wall time for a several-second reply) and streams it faster than realtime. By the time `adapter.interrupt()` runs, even with `delayRange: [0.5, 2.0]`, the bot has already sent every frame, so there is no in-flight audio left to cut. The `random_interruptions` demo recording therefore ends up at 10 s with 4 segments. The structural fix is correct, but the bot can't *demonstrate* real audio truncation.
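To make the timing argument concrete, here is a minimal sketch of how much reply audio is still unsent when the interrupt arrives. It is a back-of-the-envelope model, not code from the repo: `StreamModel`, `unsentAudioAt`, and the 4 s reply length are assumptions chosen for illustration, and frames are assumed to leave at a uniform rate.

```typescript
// Hypothetical model (not part of the repo): a reply of `replySeconds` of audio
// is pushed to the transport over `sendSeconds` of wall time at a uniform rate.
interface StreamModel {
  replySeconds: number; // duration of the synthesized reply
  sendSeconds: number;  // wall time the bot needs to send every frame
}

// Seconds of reply audio that have NOT been sent yet when the interrupt lands.
function unsentAudioAt(model: StreamModel, interruptAtSeconds: number): number {
  const t = Math.max(0, interruptAtSeconds);
  if (t >= model.sendSeconds) return 0; // everything is already on the wire
  const fractionSent = t / model.sendSeconds;
  return model.replySeconds * (1 - fractionSent);
}

// Burst bot: ~50 ms of wall time for an (assumed) 4 s reply.
const burst: StreamModel = { replySeconds: 4, sendSeconds: 0.05 };
// Realtime-paced bot: the same reply takes 4 s to send.
const paced: StreamModel = { replySeconds: 4, sendSeconds: 4 };

// Earliest interrupt allowed by delayRange: [0.5, 2.0].
console.log(unsentAudioAt(burst, 0.5)); // nothing left to cancel
console.log(unsentAudioAt(paced, 0.5)); // most of the reply can still be cut
```

Under this model, any delay inside `delayRange: [0.5, 2.0]` lands after the burst bot has finished sending, which matches the observation above.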

Additionally, the real `judgeAgent` fires inside `voiceProceed` and concludes `success=false` after one truncated agent turn, since a single truncated turn can't satisfy the criteria. That collapses the demo to 4 segments regardless of how the bot streams.

## Options for the follow-up

1. **Re-target `random_interruptions` to a realtime-streaming transport** (OpenAI Realtime or Gemini Live). These support real server-side cancel, which prevents late-frame delivery. The cost is TS-Python parity for this specific demo (Python uses Pipecat).
2. **Modify the bundled Pipecat bot to stream TTS at realtime pace.** This requires Python edits and changes the reference implementation.
3. **Suppress `judgeAgent` during `voiceProceed`.** Let proceed exhaust all `turns` before the judge fires. Helps both demos.

Recommend (3) first (cheap, helps multiple demos), then (1) for the audio-cut demonstration if still needed.
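A rough sketch of the control flow behind option (3), assuming a simplified synchronous loop. None of these names (`proceedSketch`, `deferJudge`, `Verdict`) are the library's actual API; they only illustrate the difference between judging after every turn and judging once the turns are exhausted.

```typescript
// Hypothetical types and names; the real voiceProceed / judgeAgent may differ.
interface Verdict {
  success: boolean;
}

interface ProceedSketchOptions {
  turns: number;       // number of turns proceed is asked to run
  deferJudge: boolean; // true = option (3): judge only after all turns
}

function proceedSketch(
  opts: ProceedSketchOptions,
  runTurn: (turnIndex: number) => void,
  judge: () => Verdict | null, // null means "no conclusion yet"
): { turnsRun: number; verdict: Verdict | null } {
  let turnsRun = 0;
  for (let i = 0; i < opts.turns; i++) {
    runTurn(i);
    turnsRun++;
    if (!opts.deferJudge) {
      // Behaviour described in Context: any conclusion ends the run early.
      const verdict = judge();
      if (verdict !== null) return { turnsRun, verdict };
    }
  }
  // All turns ran; the judge gets its say at the end.
  return { turnsRun, verdict: judge() };
}

// A judge that always fails, like the one that sees a single truncated turn.
const alwaysFail = (): Verdict => ({ success: false });

const early = proceedSketch({ turns: 3, deferJudge: false }, () => {}, alwaysFail);
const deferred = proceedSketch({ turns: 3, deferJudge: true }, () => {}, alwaysFail);
console.log(early.turnsRun, deferred.turnsRun);
```

The verdict is the same in both cases; what changes is how many turns make it into the recording before the run stops.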

## Current state at #561 merge

- `random-interruptions.test.ts` assertions encode what the Pipecat bot CAN prove: the interrupt fires, the canned-phrase strategy is used, the outcome is `fired_after_speech`, the truncation label is set, the agent recovers, and the conversation is multi-turn. The "median-shorter" assertion was added and then dropped on purpose; see commit `0b9dd1e`.
- The recording (`javascript/recordings/random_interruptions/full.wav`, 10 s, 4 segments) is honest about the bot's limit but thin as a demo.
- `gemini_live_interruption` already demonstrates real mid-stream audio cut-off on a realtime transport.

## Acceptance

A `random_interruptions` recording that shows:

- 30 s+ duration
- 5+ segments
- agent says >= 1.5 s of substantive audio before each barge-in
- agent recovers with non-empty audio after barge-ins
- multi-turn conversation
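The criteria above could be checked mechanically against segment metadata. The sketch below assumes a hypothetical `Segment` shape and a hypothetical `checkAcceptance` helper; neither exists in the repo as far as this issue states. It approximates "substantive" and "non-empty" by duration alone, and treats "multi-turn" as at least two agent segments and two user segments, which is an assumption.

```typescript
// Hypothetical segment metadata; the real recording format may differ.
interface Segment {
  speaker: "user" | "agent";
  startSeconds: number;
  endSeconds: number;
  interruptedByUser?: boolean; // set on agent segments cut by a barge-in
}

function checkAcceptance(segments: Segment[], totalSeconds: number): string[] {
  const failures: string[] = [];
  if (totalSeconds < 30) failures.push("duration under 30 s");
  if (segments.length < 5) failures.push("fewer than 5 segments");

  segments.forEach((seg, i) => {
    if (seg.speaker !== "agent" || !seg.interruptedByUser) return;
    // Duration stands in for "substantive audio"; content is not inspected.
    if (seg.endSeconds - seg.startSeconds < 1.5) {
      failures.push(`segment ${i}: under 1.5 s of agent audio before barge-in`);
    }
    // Recovery: some later agent segment with non-zero duration.
    const recovered = segments
      .slice(i + 1)
      .some((s) => s.speaker === "agent" && s.endSeconds > s.startSeconds);
    if (!recovered) failures.push(`segment ${i}: no agent recovery after barge-in`);
  });

  // Assumed reading of "multi-turn": both sides speak at least twice.
  const agentTurns = segments.filter((s) => s.speaker === "agent").length;
  const userTurns = segments.filter((s) => s.speaker === "user").length;
  if (agentTurns < 2 || userTurns < 2) failures.push("not multi-turn");

  return failures; // empty array means every check passed
}
```

An empty result means the recording meets the duration-based reading of the criteria; a human listen is still needed to confirm the audio is actually substantive.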
