Description
Now that you have one voice, you can use Amazon Polly to generate new sounds, too.
Steal ideas from somleng/freeswitch-config@bf7ed3c
They use mod_ssml as a way to provide more voices?
https://github.com/signalwire/freeswitch/blob/master/conf/rayo/autoload_configs/ssml.conf.xml
Specifically, configure mod_tts_commandline to use Polly.
It's invoked from the dialplan like this:
<action application="speak" data="tts_commandline|Matthew|This is an example of using tts_commandline"/>
So we can configure it to use one voice like so:
- Edit conf/autoload_configs/tts_commandline.conf.xml.
- Set the command to:
aws polly synthesize-speech --output-format pcm --voice-id "$voice_id" --engine neural --text "$text" "$file"
If you request a sample rate over 16000, Amazon can't match it for pcm output, even with neural voices, and returns an error. Is there some way to match $rate more accurately? If we just leave the sample rate out, we get the best Polly has and FS will down-sample it. Maybe pcm is overkill and we could use ogg instead, to save on bandwidth and latency?
(see docs)
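One possible answer to the rate-matching question, as a rough sketch: wrap the aws call in a small script that clamps the requested rate to a value Polly accepts for pcm (8000 or 16000). The function names `pick_polly_rate` and `polly_to_file` are made up for illustration, and the argument order is an assumption, not something mod_tts_commandline defines.

```shell
#!/bin/sh
# Hypothetical helper: pick the smallest pcm rate Polly supports that is
# still >= the requested rate, falling back to 16000 (Polly's pcm maximum).
pick_polly_rate() {
    requested="$1"
    if [ "$requested" -le 8000 ]; then
        echo 8000
    else
        echo 16000
    fi
}

# Hypothetical wrapper: synthesize text to a file at a Polly-supported rate.
# Argument order (voice, rate, text, file) is an assumption for this sketch.
polly_to_file() {
    voice="$1"
    rate="$2"
    text="$3"
    file="$4"
    aws polly synthesize-speech \
        --output-format pcm \
        --sample-rate "$(pick_polly_rate "$rate")" \
        --voice-id "$voice" \
        --engine neural \
        --text "$text" \
        "$file"
}
```

With this, a request for 48000 would be synthesized at 16000 and a request for 8000 would stay at 8000, so Polly never sees a rate it rejects. The command in tts_commandline.conf.xml could then point at a script built around `polly_to_file`, passing the voice, rate, text and file values through; whether FS resamples cleanly from there still needs testing.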
Obviously, you need the AWS CLI installed and credentials configured for this to work.