Releases: FluidInference/FluidAudio
v0.7.10
What's Changed
- Update FluidAudio version to 0.7.9 by @BrandonWeng in #192
- Add word-level timestamps support to CLI transcribe command by @Alex-Wengg in #193
- optionalize TTS via FluidAudioTTS target by @Alex-Wengg in #186
- Expose streaming chunk API and package ESpeakNG xcframework with dSYMs by @Alex-Wengg in #201
- Streaming Diarization Improvements by @SGD2718 in #191
- Fix: Move ESpeakNG.xcframework to top-level Frameworks directory by @Alex-Wengg in #205
Full Changelog: v0.7.9...v0.7.10
v0.7.9
What's Changed
- Mac Catalyst support by @BrandonWeng in #181
- add @available(iOS 17.0, *) for VBxClustering and OfflineEmbeddingExt… by @Josscii in #182
- SpeakerManager improvements by @SGD2718 in #180
- Reorganize Mac Catalyst bundle by @BrandonWeng in #183
- Add Speakmac to showcase by @kiranjd in #187
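For readers unfamiliar with the availability change in #182, here is a minimal sketch of how an `@available(iOS 17.0, *)` annotation gates a type and how a caller checks for it at runtime. `ClusteringSketch` and `countClusters` are hypothetical stand-ins for illustration, not FluidAudio types.

```swift
// Hypothetical type standing in for an API that needs newer OS features.
@available(iOS 17.0, *)
struct ClusteringSketch {
    // Returns the number of distinct labels in the assignment.
    func clusterCount(for labels: [Int]) -> Int {
        Set(labels).count
    }
}

// Callers that may run on older OS versions check availability at runtime.
func countClusters(_ labels: [Int]) -> Int? {
    if #available(iOS 17.0, *) {
        return ClusteringSketch().clusterCount(for: labels)
    }
    return nil  // The gated type is unavailable on this OS version.
}
```

Without the runtime check, the compiler rejects uses of the annotated type from code whose deployment target is lower than iOS 17.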
New Contributors
- @Josscii made their first contribution in #182
- @SGD2718 made their first contribution in #180
- @kiranjd made their first contribution in #187
Full Changelog: v0.7.8...v0.7.9
v0.7.8
Impact:
- Removed the shared buffers in the diarization pipeline that were causing concurrency crashes, with less than 3% impact on latency.
- Reduced missing words by 10% when running ASR on long audio files.
- Slightly improved WER for v2 and v3 (~0.5% on benchmarks), and ~5% faster!
- The default download registry can now be overridden programmatically (e.g. hf-mirror.com), which is useful for developers in China.
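The idea behind a registry override can be sketched as follows. This is not FluidAudio's API: `defaultRegistry`, `downloadURL(forPath:registryOverride:)`, and the example paths are assumptions made only to show how a mirror host might replace the default one when building a download URL.

```swift
import Foundation

// Assumed default host; the library's real default is not stated in these notes.
let defaultRegistry = "https://huggingface.co"

// Hypothetical helper: builds a download URL, optionally swapping the registry host.
func downloadURL(forPath path: String, registryOverride: String? = nil) -> URL? {
    let base = registryOverride ?? defaultRegistry
    guard var components = URLComponents(string: base) else { return nil }
    // Normalize so the path always starts with exactly one slash.
    components.path = "/" + path.trimmingCharacters(in: CharacterSet(charactersIn: "/"))
    return components.url
}
```

Passing `registryOverride: "https://hf-mirror.com"` keeps the same repository path while pointing the request at the mirror.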
What's Changed
- Make ANE Utils concurrency safe by @BrandonWeng in #172
- Standardize registry override by @BrandonWeng in #175
- Fix Outdated speakermanager doc by @Alex-Wengg in #176
- Switch ASR to stateless for batching by @BrandonWeng in #177
- shoutout to @hamzaq2000 for his help here!
Full Changelog: v0.7.7...v0.7.8
v0.7.7
What's Changed
- Add Intel ESpeakNG support so it's easier for others to build by @BrandonWeng in #167
Full Changelog: v0.7.6...v0.7.7
v0.7.6
What's Changed
- Remove unsafe flags from SPM by @BrandonWeng in #166
This should fix SPM issues
Full Changelog: v0.7.5...v0.7.6
v0.7.5
What's Changed
Core ASR & Diarization
- Fixed VAD threshold overriding per segment (#153, #155) — thanks @starcrest
- Added pyannote community-1 model for offline speaker diarization (#150) — @BrandonWeng
Speech Synthesis (ESpeakNG)
- Fixed ESpeakNG framework structure (resolves #159, #160) (#161) — @antonlvovych
- Added ESpeak linking tests (#162) — @BrandonWeng
Dataset & Pipeline Improvements
- Expanded FLEURS dataset coverage to all 25 languages
- Added Hugging Face download retries for robustness (#158) — @BrandonWeng
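Download retries of the kind mentioned in #158 are commonly built on a retry loop with exponential backoff. The sketch below shows that general pattern only; `withRetries` and its parameters are hypothetical and do not reflect how FluidAudio implements it.

```swift
import Foundation

// Illustrative retry helper with exponential backoff (not FluidAudio's code).
func withRetries<T>(
    maxAttempts: Int = 3,
    initialDelay: TimeInterval = 0.5,
    _ operation: () throws -> T
) throws -> T {
    precondition(maxAttempts > 0, "maxAttempts must be positive")
    var delay = initialDelay
    var attempt = 1
    while true {
        do {
            return try operation()
        } catch {
            // Out of attempts: surface the last error to the caller.
            if attempt >= maxAttempts { throw error }
            Thread.sleep(forTimeInterval: delay)
            delay *= 2  // Double the wait before the next attempt.
            attempt += 1
        }
    }
}
```

A caller would wrap the failing step, for example `try withRetries { try fetchModel() }`, where `fetchModel` is whatever throwing download function the caller already has.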
Internal Cleanups
- General refactors and organization improvements across diarization and data pipelines.
New Contributors
- @starcrest made their first contribution in #153
- @antonlvovych made their first contribution in #161
Full Changelog: v0.7.4...v0.7.5
v0.7.4
What's Changed
- Fix Kokoro File phonic not found issue with xFramework by @BrandonWeng in #151
It should now build and run properly; this was verified with another developer.
Full Changelog: v0.7.2...v0.7.4
v0.7.2
What's Changed
- Update ESpeak and hard fail if missing by @BrandonWeng in #148
- Include header for import files by @BrandonWeng in #149
- Bump minimum versions to macOS 14 and iOS 17. This shouldn't matter for most users, as our models are built against macOS 14 and iOS 17.
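For packages that depend on this release, the minimum-version bump means the consuming manifest needs matching platform entries. A minimal `Package.swift` sketch follows, with several assumptions: `MyApp` is a hypothetical consumer, the repository URL is inferred from the repository name, and the product name `FluidAudio` is assumed rather than confirmed by these notes.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyApp",  // hypothetical consumer package
    platforms: [
        .macOS(.v14),  // matches the new minimum macOS version
        .iOS(.v17),    // matches the new minimum iOS version
    ],
    dependencies: [
        // URL inferred from the repository name (assumption).
        .package(url: "https://github.com/FluidInference/FluidAudio.git", from: "0.7.2"),
    ],
    targets: [
        .target(
            name: "MyApp",
            dependencies: [
                // Product name assumed for illustration.
                .product(name: "FluidAudio", package: "FluidAudio"),
            ]
        ),
    ]
)
```

The `.v14` and `.v17` platform constants require swift-tools-version 5.9 or later.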
Full Changelog: v0.7.1...v0.7.2
v0.7.1
What's Changed
- Adding support for phonetic and alias replacement by @smdesai in #140
- Fix ML cache race condition for streaming ASR by @BrandonWeng in #147
Full Changelog: v0.7.0...v0.7.1