Skip to content

chore(examples/voice/#486): retire legacy gpt-4o-audio-preview surface, migrate supported audio examples to gpt-audio-mini#612

Merged
drewdrewthis merged 7 commits into
mainfrom
fix/486-delete-deprecated-voice-test
Jun 11, 2026
Merged

chore(examples/voice/#486): retire legacy gpt-4o-audio-preview surface, migrate supported audio examples to gpt-audio-mini#612
drewdrewthis merged 7 commits into
mainfrom
fix/486-delete-deprecated-voice-test

Conversation

@drewdrewthis

@drewdrewthis drewdrewthis commented Jun 4, 2026

Copy link
Copy Markdown
Collaborator

What

Cohesive retirement of the legacy gpt-4o-audio-preview voice/audio example surface, consolidating three threads into one PR:

Why we don't unskip (the key correction)

#486 framed the goal as "remove the skipif(CI) markers and restore CI coverage." That premise was never achievable: these audio/voice example tests are live end-to-end tests — they call real OpenAI (gpt-audio-mini) and the real LangWatch backend, incur cost, and produce non-deterministic audio. They are correctly CI-skipped regardless of model (same class as the other live voice/* example tests). The dead gpt-4o-audio-preview model was only the historical reason for the skip; swapping it doesn't make these CI-runnable.

So the right end-state is migrated + intentionally CI-skipped, not "unskipped." We migrate the model so the examples work when run live/locally, keep the skip markers, and fix the stale skip comments to state the real reason.

What changed

Genuinely-dead → retired:

  • Tombstone docs/docs/pages/examples/multimodal/voice-to-voice.mdx and testing-voice-agents.mdx → a short pointer to /voice/getting-started. The pages' URLs still return 200 (the vocs fork has no redirect layer, so a tombstone avoids 404s on previously-public URLs).
  • Delete the now-unused LegacyVoiceDeprecation.mdx snippet (zero importers after the edits).
  • test_voice_to_voice_conversation.py deleted (it was explicitly DEPRECATED — the legacy single-call pattern).

Supported → kept + migrated to gpt-audio-mini:

  • audio-to-text.mdx / audio-to-audio.mdx kept and updated (prereq prose now names gpt-audio-mini; LegacyVoiceDeprecation banner removed) — these document the current supported single-call pattern, not a legacy one.
  • TS example tests (multimodal-audio-to-text, multimodal-audio-to-audio, multimodal-voice-to-voice-conversation, helpers/openai-voice-agent.ts) migrated to gpt-audio-mini with updated skip-comments (via fix(examples/voice): swap deleted gpt-4o-audio-preview → gpt-audio-mini #607's cherry-picked commits).
  • Python test_audio_to_text.py / test_audio_to_audio.py skip-comments rewritten to "live-E2E, not model"; skipif(CI) markers retained (no model-literal change — they route through the helper's gpt-audio-mini default).
  • _generated example partials regenerated to match the migrated test sources.
  • overview.mdx voice-agents link repointed to /voice/getting-started.

Verification

  • pnpm build (in docs/) exits 0, no broken-link/missing-import errors.
  • Tombstone routes build and render the /voice/getting-started pointer (confirmed in dist/examples/multimodal/voice-to-voice/index.html).
  • audio-to-text built page references gpt-audio-mini (x6); no live gpt-4o-audio-preview model literal remains anywhere (grep over model=/model: is empty — remaining mentions are explanatory comments/tombstone prose only).
  • Python/TS audio tests retain their CI-skip markers (live E2E by design).

Closes / supersedes

🤖 Generated with Claude Code

@github-actions github-actions Bot added the low-risk-change PR qualifies as low-risk per policy and can be merged without manual review label Jun 4, 2026
github-actions[bot]
github-actions Bot previously approved these changes Jun 4, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved by automation: PR qualifies as low-risk-change under the documented policy.

@drewdrewthis drewdrewthis added the grinding Grinder is actively managing this PR label Jun 4, 2026
@github-actions github-actions Bot removed the low-risk-change PR qualifies as low-risk per policy and can be merged without manual review label Jun 4, 2026
@drewdrewthis

Copy link
Copy Markdown
Collaborator Author

[grinder] READY for human review

CI: green (zero failing, zero pending)
ACs: met — deleted `test_voice_to_voice_conversation.py` (self-annotated DEPRECATED), removed its MDX import/tab, removed its `voice-integration.yml` entry, updated manifest; Closes #486 (closing link confirmed by GitHub)
Threads: zero unresolved, zero outdated

Verified by:
`gh pr checks 612` → all 17 checks pass/skip, zero pending/failing
`gh api graphql reviewThreads` → `nodes: []` (zero threads)
`gh api graphql closingIssuesReferences` → `nodes: [{number: 486}]`
`python/tests/voice/test_feature_file_contract.py` contract counts updated (127 scenarios, 79/13/35 tag split) via cherry-pick of 6ea8b8d — `test (3.12)` passes

@drewdrewthis

Copy link
Copy Markdown
Collaborator Author

✅ Review + prove-it: READY (after closing-ref correction)

Review: deleting test_voice_to_voice_conversation.py is correct — its own docstring marked it DEPRECATED ("legacy gpt-4o-audio-preview single-call pattern"), it pinned the deleted model, and the capability is covered by the VoiceAgentAdapter demos in python/examples/voice/ (30+ files) + the TS multimodal-voice-to-voice-conversation.test.ts. No coverage regression.

Prove-it:

  • Collection clean: uv run pytest examples/ --collect-only → no import errors from the deletion (the 3 errors are pre-existing Missing OPENAI_API_KEY on unrelated remote-agent SSE tests).
  • grep -rn test_voice_to_voice_conversation python/ → zero dangling refs.

Fixed before ready: the body said Closes #486, but #486 is a 7-file unskip issue and this PR (+#607+#610) addresses only 2; five files (test_audio_to_audio.py, test_audio_to_text.py, 3 JS voice tests) still carry the dead-model skip. Corrected the body → Part of #486, and posted the full file-by-file status on #486 so it stays open until all seven are green. (If GitHub's cached link still auto-closes #486 on merge, reopen it.)

Minor: carries a disclosed cherry-picked contract-count fix to test_feature_file_contract.py (borrowed from #610) — may trivially conflict if #610 lands first.

Comment thread docs/docs/pages/examples/multimodal/voice-to-voice.mdx Outdated
rogeriochaves
rogeriochaves previously approved these changes Jun 5, 2026
drewdrewthis and others added 6 commits June 5, 2026 14:10
…gacy test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t file

Companion to the delete commit: `test_voice_to_voice_conversation.py` was
removed but two references remained:
- docs/scripts/mdx-examples-manifest.js: remove sourcePath entry
- .github/workflows/voice-integration.yml: remove from pytest command

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ython example

The deleted `test_voice_to_voice_conversation.py` was referenced in
`docs/docs/pages/examples/multimodal/voice-to-voice.mdx` as:
- a generated MDX import (breaking the docs build)
- a Python LanguageTabs.CodeTab (now empty)
- a prose GitHub link in "Complete Sources"

Remove the import, the Python tab, and update the prose link to point to
the helper utilities instead with a note about the legacy pattern removal.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The voice-to-voice example helper and the audio-to-text example pinned
`gpt-4o-audio-preview`, which OpenAI has removed (404 model_not_found
since 2026-05-19). Any user running the canonical voice example hit an
immediate 404.

Switch to `gpt-audio-mini` — OpenAI's current cost-efficient GA
audio-chat model — matching the Python twin, which already migrated
(python/scenario/config/voice_models.py:44 OPENAI_AUDIO_CHAT_MODEL,
python/examples/test_audio_to_text.py:157). Verified live: gpt-audio-mini
accepts the identical chat.completions shape (modalities:["text","audio"],
audio:{voice,format}) and returns audio. Re-ran the voice-to-voice e2e
against prod LangWatch — success: true, real 2-turn conversation, traces
landed (project_bZspxwkhCD4POvqmIgOr2).

SDK core was unaffected (OpenAIRealtimeAgentAdapter uses gpt-realtime-mini).
This closes a py↔ts example-parity gap left by #561.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… refs after model swap

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…e, keep+migrate supported audio examples

Cohesive retirement of the legacy gpt-4o-audio-preview voice/audio example
surface, folding in the model swap from #607 (cherry-picked) and superseding
the unskip plan in #486.

Genuinely-dead (retired):
- Tombstone docs/docs/pages/examples/multimodal/voice-to-voice.mdx and
  testing-voice-agents.mdx -> pointer to /voice/getting-started (URLs still
  200; the langwatch vocs fork has no redirect layer, so a tombstone is how
  we avoid 404s on previously-public URLs).
- Delete the now-unused LegacyVoiceDeprecation.mdx snippet (no importers left).
- (test_voice_to_voice_conversation.py already deleted in an earlier #486 commit.)

Supported (kept + migrated to gpt-audio-mini):
- audio-to-text.mdx / audio-to-audio.mdx kept and updated: prereq prose now
  names gpt-audio-mini; LegacyVoiceDeprecation banner removed (these document
  the CURRENT supported single-call pattern, not a legacy one).
- Python test_audio_to_text.py / test_audio_to_audio.py: skip COMMENTS rewritten
  to the real reason (live E2E -- real OpenAI gpt-audio-mini + LangWatch backend,
  cost, non-deterministic audio); skipif(CI) markers retained by design. No model
  literal change (they route through the helper's gpt-audio-mini default).
- _generated example partials regenerated to match the migrated test sources.

overview.mdx voice-agents link repointed to /voice/getting-started.

Why the audio tests stay CI-skipped: they are live end-to-end tests; #486's
"unskip to restore CI coverage" premise was never achievable (cost +
non-determinism). The right end-state is migrated-and-intentionally-skipped.

Docs build: pnpm build exits 0, no broken-link/missing-import errors; tombstone
routes render the /voice/getting-started pointer.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@drewdrewthis drewdrewthis force-pushed the fix/486-delete-deprecated-voice-test branch from 4acb9db to 98d970a Compare June 5, 2026 12:22
@drewdrewthis drewdrewthis changed the title chore(examples/voice/#486): delete deprecated voice-to-voice legacy test chore(examples/voice/#486): retire legacy gpt-4o-audio-preview surface, migrate supported audio examples to gpt-audio-mini Jun 5, 2026
@drewdrewthis

drewdrewthis commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator Author

Review verdict: READY

Re-reviewed at HEAD 6256f4d (was NOT-READY at 98d970a). All three blocking findings from the prior pass are resolved and verified on the branch; CI is green (python-complete + javascript-complete both SUCCESS — the two required checks). Verified each fix directly against the branch this session, not from the author report.

Resolved since last review

  1. [principles][hygiene] Spec ↔ code contradictionspecs/voice-docs-surface.feature rewritten: the retirement scenario group now asserts the tombstone-and-delete reality. AC18 → "the legacy source file is deleted from the repo" (+ canonical demos under python/examples/voice/); AC17 → "the shared LegacyVoiceDeprecation.mdx snippet no longer exists, the pointer living directly in each page"; AC15 → tombstones still resolve 200 (no 404) and the two audio pages stay live/migrated to gpt-audio-mini. Coherent Gherkin, full-read verified.
  2. [hygiene] Dangling tombstone linkdocs/docs/pages/examples/multimodal/multimodal-images.mdx:173 repointed to /voice/getting-started; grep confirms it was the only remaining un-repointed link.
  3. [principles] Asymmetric retirement — the JS voice-to-voice twin is retired symmetrically: multimodal-voice-to-voice-conversation.test.ts + its mdx-examples-manifest.js entry + the _generated partial all deleted (manifest now has 0 voice-to-voice refs; nothing else imported the partial).

Non-blocking

  1. [test] Dead helper exports — RESOLVED: save_conversation_audio + concatenate_wav_files removed (grep-confirmed no other consumer); encode_audio_to_base64 correctly kept (still used by 2 tests).
  2. [test] Dead CI step (voice-integration.yml:129) + brittle judge-criteria parity — pre-existing, intentionally out of scope; tracked as separate New-Issue follow-ups.

Evidence

  • CI on 6256f4d: python-complete SUCCESS, javascript-complete SUCCESS.
  • Docs pnpm build exit 0; both tombstones render /voice/getting-started; multimodal-images.html shows 0 stale links; pytest --co 880 collected (the FIX4 trim broke no imports).

… symmetric JS twin retirement, drop dead helper exports

FIX 1 [blocker]: rewrite specs/voice-docs-surface.feature deprecation group
to assert the retirement reality, not the old keep-and-banner strategy.
- AC15: reframed to "tombstoned pages still 200, point to /voice/getting-started"
  and clarifies the supported audio pages stay live+migrated (not tombstoned).
- AC17: shared LegacyVoiceDeprecation.mdx snippet is gone; rewritten to an
  inline per-page tombstone pointer.
- AC18: flipped from "source file is not deleted" to assert the file IS deleted
  and canonical demos live at python/examples/voice/*.
Background + AC Coverage Map updated to match.

FIX 2 [blocker]: repoint dangling tombstone link in
docs/docs/pages/examples/multimodal/multimodal-images.mdx from
./testing-voice-agents to /voice/getting-started (mirrors audio-to-*.mdx).

FIX 3 [blocker]: symmetric retirement of the JS voice-to-voice twin — delete
the test, its mdx-examples-manifest.js entry, and the orphaned generated mdx.
Confirmed no other page imports the generated mdx.

FIX 4 [non-blocker]: drop dead helper exports save_conversation_audio +
concatenate_wav_files (deleted test was their only consumer; grep shows no
other importer) from helpers/__init__.py and their defs in audio_helpers.py.
encode_audio_to_base64 is kept (still used by audio-to-audio/text examples).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

Copy link
Copy Markdown
Contributor

Automated low-risk assessment

This PR was evaluated against the repository's Low-Risk Pull Requests procedure and does not qualify as low risk.

This PR modifies files in restricted directories that require manual review per policy.

This PR requires a manual review before merging.

@drewdrewthis drewdrewthis merged commit 1ebdd1c into main Jun 11, 2026
21 checks passed
@drewdrewthis drewdrewthis deleted the fix/486-delete-deprecated-voice-test branch June 11, 2026 09:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-ready slack-requested Slack PR review request posted

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ci: migrate voice-agent example tests off deleted gpt-4o-audio-preview, then unskip

2 participants