
feat(speech): integrate TTS/ASR cloud vendors into speech registry (task 21) #1255

Merged
doroteaMonaco merged 7 commits into mofa-org:main from Nixxx19:feat/task-21-speech-registry
Mar 16, 2026

Nixxx19 commented Mar 15, 2026

Summary

Wire OpenAI, ElevenLabs, and Deepgram cloud speech adapters into the MoFA framework so that callers can configure and register them through a single, config-driven API surface rather than constructing each adapter manually.

Related Issues

Closes #1256 -- Integrate TTS/ASR from 2-3 Cloud Vendors

Context

The vendor adapters (OpenAI TTS/ASR, ElevenLabs TTS, Deepgram ASR) already existed in mofa-integrations and already implemented the TtsAdapter / AsrAdapter kernel traits. Three gaps remained:

  1. No SDK re-exports -- callers had to depend directly on mofa-integrations
  2. No config-driven factory -- callers had to construct every adapter manually
  3. No single function to populate a SpeechAdapterRegistry from a config block

Changes

crates/mofa-sdk/Cargo.toml

  • Added mofa-integrations as an optional dependency
  • Added three forwarded feature flags: openai-speech, elevenlabs, deepgram

crates/mofa-sdk/src/lib.rs -- new pub mod speech

  • Re-exports kernel speech traits (TtsAdapter, AsrAdapter, AudioFormat, TtsConfig, AsrConfig, TtsOutput, AsrResult, VoiceInfo)
  • Re-exports foundation types (SpeechAdapterRegistry, VoicePipeline, VoicePipelineConfig, VoicePipelineResult)
  • Re-exports config types (SpeechConfig, SpeechProviderConfig) under all three feature flags
  • Per-feature adapter re-exports (OpenAiTtsAdapter, OpenAiAsrAdapter, ElevenLabsTtsAdapter, DeepgramAsrAdapter)
  • register_speech_adapters(registry, config) -- feature-gated factory that constructs and registers the correct adapter per provider entry
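The factory described in the last bullet can be pictured with a minimal, self-contained sketch. Everything here is a hypothetical stand-in rather than the actual MoFA code: the two traits are reduced to a single method, StubAdapter plays the role of every vendor adapter, Registry and ProviderEntry are simplified, and the provider strings ("openai", "elevenlabs", "deepgram") are assumed. In the real crate each match arm would presumably sit behind its feature flag.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the kernel traits; the real ones have richer APIs.
pub trait TtsAdapter {
    fn name(&self) -> &str;
}
pub trait AsrAdapter {
    fn name(&self) -> &str;
}

/// One stub type plays the role of every vendor adapter in this sketch.
pub struct StubAdapter {
    vendor: &'static str,
    _api_key: String,
}
impl TtsAdapter for StubAdapter {
    fn name(&self) -> &str {
        self.vendor
    }
}
impl AsrAdapter for StubAdapter {
    fn name(&self) -> &str {
        self.vendor
    }
}

/// Simplified per-provider entry (assumed shape).
pub struct ProviderEntry {
    pub provider: String,
    pub api_key: String,
    pub default_tts: bool,
    pub default_asr: bool,
}

/// Simplified registry (assumed shape, not SpeechAdapterRegistry itself).
#[derive(Default)]
pub struct Registry {
    pub tts: HashMap<String, Box<dyn TtsAdapter>>,
    pub asr: HashMap<String, Box<dyn AsrAdapter>>,
    pub default_tts: Option<String>,
    pub default_asr: Option<String>,
}

/// Registers one adapter per capability for each entry; unknown providers are an error.
pub fn register_all(registry: &mut Registry, entries: &[ProviderEntry]) -> Result<(), String> {
    for e in entries {
        // Capabilities follow the PR text: OpenAI TTS+ASR, ElevenLabs TTS, Deepgram ASR.
        let (vendor, has_tts, has_asr): (&'static str, bool, bool) = match e.provider.as_str() {
            "openai" => ("openai", true, true),
            "elevenlabs" => ("elevenlabs", true, false),
            "deepgram" => ("deepgram", false, true),
            other => return Err(format!("unknown speech provider: {other}")),
        };
        if has_tts {
            let adapter = StubAdapter { vendor, _api_key: e.api_key.clone() };
            registry.tts.insert(vendor.to_string(), Box::new(adapter));
            if e.default_tts {
                registry.default_tts = Some(vendor.to_string());
            }
        }
        if has_asr {
            let adapter = StubAdapter { vendor, _api_key: e.api_key.clone() };
            registry.asr.insert(vendor.to_string(), Box::new(adapter));
            if e.default_asr {
                registry.default_asr = Some(vendor.to_string());
            }
        }
    }
    Ok(())
}
```

In this sketch a default flag for a capability the vendor lacks (for example default_tts on Deepgram) is silently ignored; whether the real factory does the same or reports an error is not stated in this PR.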

crates/mofa-integrations/src/speech/registry_builder.rs (new file)

  • SpeechProviderConfig -- per-vendor config: provider, api_key, default_tts, default_asr
  • SpeechConfig -- top-level config holding a list of provider entries
  • Builder methods: new(), with_provider(), as_default_tts(), as_default_asr()
  • Full serde support with #[serde(default)] on booleans for forward-compatible deserialization
  • Internal unit tests (4 tests)
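As a rough illustration of the builder chain listed above, the sketch below reuses the type, field, and method names from the bullets. Which type owns which method, the providers field name, and the new(provider, api_key) signature are assumptions, and the serde derives are left out so the block compiles without external crates.

```rust
/// Per-vendor config. In the real crate this would also derive Serialize and
/// Deserialize, with #[serde(default)] on the two booleans so that a missing
/// flag deserializes as false.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct SpeechProviderConfig {
    pub provider: String,
    pub api_key: String,
    pub default_tts: bool,
    pub default_asr: bool,
}

impl SpeechProviderConfig {
    /// Assumed constructor: both default flags start as false.
    pub fn new(provider: impl Into<String>, api_key: impl Into<String>) -> Self {
        Self {
            provider: provider.into(),
            api_key: api_key.into(),
            ..Self::default()
        }
    }

    /// Marks this provider as the default TTS backend.
    pub fn as_default_tts(mut self) -> Self {
        self.default_tts = true;
        self
    }

    /// Marks this provider as the default ASR backend.
    pub fn as_default_asr(mut self) -> Self {
        self.default_asr = true;
        self
    }
}

/// Top-level config holding a list of provider entries (field name assumed).
#[derive(Debug, Clone, PartialEq, Default)]
pub struct SpeechConfig {
    pub providers: Vec<SpeechProviderConfig>,
}

impl SpeechConfig {
    /// Starts with no providers.
    pub fn new() -> Self {
        Self::default()
    }

    /// Appends one provider entry and returns the config for chaining.
    pub fn with_provider(mut self, provider: SpeechProviderConfig) -> Self {
        self.providers.push(provider);
        self
    }
}

/// Example chain: OpenAI as default TTS, Deepgram as default ASR.
/// The key strings are placeholders, not real credentials.
pub fn example_config() -> SpeechConfig {
    SpeechConfig::new()
        .with_provider(SpeechProviderConfig::new("openai", "OPENAI_KEY_PLACEHOLDER").as_default_tts())
        .with_provider(SpeechProviderConfig::new("deepgram", "DEEPGRAM_KEY_PLACEHOLDER").as_default_asr())
}
```

Consuming self in each builder method keeps the chain a single expression and avoids a separate build() step, which fits a plain data config like this one.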

crates/mofa-integrations/src/speech/mod.rs

  • Unconditionally exports the new registry_builder module

crates/mofa-integrations/tests/speech_example_tests.rs

  • Added mod registry_builder_tests (4 tests): empty config, builder chain, JSON roundtrip, missing default flags

Testing

All non-ignored tests pass without network access:

# Build with all speech features
cargo build -p mofa-sdk --features openai-speech,elevenlabs,deepgram

# Run registry builder tests (no network required)
cargo test -p mofa-integrations \
    --features openai-speech,elevenlabs,deepgram \
    -- speech

# Results: 12 passed, 3 ignored (live-API tests skipped in CI)

Live-API tests are marked #[ignore] and require vendor API keys via environment variables:

OPENAI_API_KEY=sk-... cargo test -p mofa-integrations \
    --features openai-speech -- --ignored
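For readers unfamiliar with the pattern, a live-API test gated this way might look roughly like the sketch below. The helper live_api_key and the test name are hypothetical and do not come from this PR; only the #[ignore] attribute and the OPENAI_API_KEY variable are taken from the description above.

```rust
use std::env;

/// Returns the API key from the given environment variable,
/// or None when the variable is unset or blank.
pub fn live_api_key(var: &str) -> Option<String> {
    env::var(var).ok().filter(|key| !key.trim().is_empty())
}

#[cfg(test)]
mod live_tests {
    use super::*;

    // Skipped by a plain `cargo test`; runs only when `-- --ignored` is passed.
    #[test]
    #[ignore = "requires OPENAI_API_KEY and network access"]
    fn openai_live_smoke() {
        let key = live_api_key("OPENAI_API_KEY").expect("OPENAI_API_KEY must be set");
        // A real test would construct the vendor adapter with `key` and make
        // one small request; that part is omitted from this sketch.
        assert!(!key.is_empty());
    }
}
```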

Breaking Changes

None. All new API surface is additive. Existing code that does not enable the new feature flags is unaffected.

Checklist

  • All new public enums are #[non_exhaustive]
  • Feature-gated dependencies are declared with optional = true
  • No circular dependencies introduced (factory in sdk, config types in integrations)
  • Serde roundtrip tests added
  • Live-API tests remain #[ignore]d for CI

Nixxx19 changed the title from "feat(speech): integrate TTS/ASR cloud vendors into speech registry" to "feat(speech): integrate TTS/ASR cloud vendors into speech registry (task 21)" on Mar 15, 2026
doroteaMonaco merged commit c226e1a into mofa-org:main on Mar 16, 2026
6 checks passed