feat: add MiniMax Cloud TTS as third voiceover provider by octo-patch · Pull Request #11 · digitalsamba/claude-code-video-toolkit

octo-patch · 2026-03-30T22:46:48Z

Summary

Adds MiniMax Cloud TTS as a third voiceover provider alongside ElevenLabs and Qwen3-TTS.

No GPU required — runs entirely via MiniMax cloud API
Two models: speech-2.8-hd (high quality) and speech-2.8-turbo (faster)
12 built-in voices: 5 English + 7 Chinese
Full integration with voiceover.py (--provider minimax) and standalone minimax_tts.py
Brand config support via voice.json minimax section

Usage

# Standalone
python tools/minimax_tts.py --text "Hello world" --voice English_Graceful_Lady --output hello.mp3
python tools/minimax_tts.py --list-voices

# Via voiceover.py (single file or per-scene)
python tools/voiceover.py --provider minimax --script script.md --output out.mp3
python tools/voiceover.py --provider minimax --minimax-voice English_Persuasive_Man --scene-dir scenes/ --json

Files Changed (8 files, ~1159 additions)

| File | Change |
| --- | --- |
| `tools/minimax_tts.py` | New standalone MiniMax TTS tool |
| `tools/voiceover.py` | Add `minimax` to `--provider` choices, MiniMax CLI options, generation dispatch |
| `tools/config.py` | Add `get_minimax_api_key()` helper |
| `brands/default/voice.json` | Add `minimax` config section |
| `README.md` | Document MiniMax TTS usage |
| `CLAUDE.md` | Document MiniMax TTS standalone tool |
| `tests/test_minimax_tts.py` | 28 unit tests |
| `tests/test_minimax_tts_integration.py` | 3 integration tests |

Test Plan

28 unit tests pass (mocked API, CLI parsing, dry-run, brand config, payload format)
3 integration tests pass (real API calls: hd model, turbo model, voiceover.py integration)
Existing ElevenLabs and Qwen3-TTS providers unaffected
Verify --list-voices output
Test per-scene mode with MiniMax provider

Add MiniMax Cloud TTS (speech-2.8-hd / speech-2.8-turbo) as a third voiceover provider alongside ElevenLabs and Qwen3-TTS. MiniMax offers 12 built-in voices (5 English + 7 Chinese) and needs no GPU; it runs entirely via the cloud API.

Changes:
- tools/minimax_tts.py: standalone MiniMax TTS tool with --list-voices, --model hd/turbo, --voice, --speed, --volume, --pitch options
- tools/voiceover.py: add --provider minimax with --minimax-voice, --minimax-model, --volume, --pitch options; works in both single-file and per-scene modes
- tools/config.py: add get_minimax_api_key() helper
- brands/default/voice.json: add minimax config section
- README.md, CLAUDE.md: document MiniMax TTS usage
- tests/test_minimax_tts.py: 28 unit tests
- tests/test_minimax_tts_integration.py: 3 integration tests

ConalMullan

Thanks so much for this contribution — really well done! You've clearly studied the codebase and followed the existing patterns closely. The test coverage is solid and the voiceover.py integration is clean. I'm keen to try MiniMax out once this is merged — still on the lookout for the best TTS provider so this is great timing.

A couple of things to address before merging:

1. `--volume` and `--pitch` should be namespaced (medium)

These are added as top-level args in voiceover.py, but they're MiniMax-specific. The Qwen3 args use --speaker, --tone, etc. — for consistency and to avoid future collisions, these should be --minimax-volume and --minimax-pitch.
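For illustration, the renamed flags could be declared roughly like this. This is a minimal sketch, not the code in the PR: `build_parser` is a hypothetical helper, the provider spellings other than `minimax` are assumed, and the default values and the integer type for pitch are guesses.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser showing only the options this comment is about."""
    parser = argparse.ArgumentParser(prog="voiceover.py")
    # Provider spellings other than "minimax" are assumptions.
    parser.add_argument(
        "--provider",
        choices=["elevenlabs", "qwen3", "minimax"],
        default="elevenlabs",
    )

    # Grouping keeps provider-specific flags together in --help output.
    minimax = parser.add_argument_group("MiniMax options")
    minimax.add_argument("--minimax-voice", default="English_Graceful_Lady")
    # Defaults and types below are assumed, not taken from the PR.
    minimax.add_argument("--minimax-volume", type=float, default=1.0)
    minimax.add_argument("--minimax-pitch", type=int, default=0)
    return parser


args = build_parser().parse_args(
    ["--provider", "minimax", "--minimax-volume", "1.5", "--minimax-pitch", "-2"]
)
print(args.minimax_volume, args.minimax_pitch)
```

With the prefix in place, a bare `--volume` is rejected by argparse as an unrecognized argument, so another provider can claim that name later without a collision.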

2. Missing `toolkit-registry.json` entry (gap)

Per our architecture, new tools should be added to _internal/toolkit-registry.json — that's the canonical catalog for all tools, skills, components, etc. CLAUDE.md and README are updated (nice!), but the registry needs an entry too. See the existing entries for qwen3_tts or sadtalker as examples.

Minor notes (non-blocking)

No input validation on speed/volume/pitch ranges — the docstring says speed is 0.5–2.0, volume 0.1–10.0, pitch -12 to 12, but values pass straight through to the API. This is consistent with how the other tools work, so not a blocker — just noting it.
Brand config default detection — checking args.minimax_voice == "English_Graceful_Lady" to detect "user didn't set it" is the same pattern as Qwen3's args.speaker == "Ryan" check, so it's fine for now.
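If either note is picked up later, both could be handled at the argparse layer. The sketch below is one possible approach under stated assumptions: `bounded` and `resolve_voice` are hypothetical helpers, the ranges are the ones quoted from the docstring above, and the `voice` key inside the `minimax` section of voice.json is an assumed shape.

```python
import argparse

# Ranges as quoted from the tool's docstring in the note above.
RANGES = {"speed": (0.5, 2.0), "volume": (0.1, 10.0), "pitch": (-12, 12)}


def bounded(name, cast=float):
    """Return an argparse `type` callable that rejects out-of-range values."""
    low, high = RANGES[name]

    def parse(raw):
        value = cast(raw)
        if not low <= value <= high:
            raise argparse.ArgumentTypeError(
                f"{name} must be between {low} and {high}, got {value}"
            )
        return value

    return parse


def resolve_voice(cli_voice, brand_config, fallback="English_Graceful_Lady"):
    """Prefer an explicit CLI value, then the brand config, then the fallback."""
    if cli_voice is not None:
        return cli_voice
    return brand_config.get("minimax", {}).get("voice", fallback)


parser = argparse.ArgumentParser()
parser.add_argument("--speed", type=bounded("speed"), default=1.0)
parser.add_argument("--pitch", type=bounded("pitch", int), default=0)
# A None default marks "not set by the user" without comparing voice names.
parser.add_argument("--minimax-voice", default=None)

args = parser.parse_args(["--speed", "1.25", "--pitch", "-3"])
brand = {"minimax": {"voice": "English_Persuasive_Man"}}  # assumed voice.json shape
print(args.speed, args.pitch, resolve_voice(args.minimax_voice, brand))
```

The `None` sentinel also covers the case where a user explicitly passes the default voice name, which a string comparison against `"English_Graceful_Lady"` cannot distinguish from "not set".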

Thanks again — looking forward to the update! 🎙️

ConalMullan requested changes Apr 3, 2026

octo-patch wants to merge 1 commit into digitalsamba:main from octo-patch:feature/add-minimax-tts-provider
