Easy speech synthesis with Kokoro, with subtitle support, in Python.
- Features
- Requirements
- Installation
- Examples
- Usage
- Command Line Interface (CLI)
- Example Output Files
- Build from Source
- API
- License
Features
- Simpler interface for generating speech audio and subtitles
- Easier Subtitles with Sentence and Word Timestamps
- Supports all Kokoro Voices with a Simpler Voice System
- Automatic Model Management
- More Readable for Beginners
Requirements
- Python 3.10+
- torch
- kokoro
- soundfile
All dependencies except Python are installed automatically.
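To confirm that an environment meets these requirements before running the examples, a quick check along the following lines may help. This is a minimal sketch using only the standard library; `check_requirements` is a hypothetical helper, not part of Simpler-Kokoro, and it assumes the import names match the requirement names listed above (`torch`, `kokoro`, `soundfile`).

```python
import importlib.util
import sys

# Assumption: import names match the requirement names listed above.
REQUIRED_MODULES = ("torch", "kokoro", "soundfile")
MIN_PYTHON = (3, 10)


def check_requirements(modules=REQUIRED_MODULES, min_python=MIN_PYTHON):
    """Return (python_ok, missing) without importing the heavy modules."""
    python_ok = sys.version_info >= min_python
    # find_spec only locates a top-level module; it does not execute it.
    missing = [name for name in modules if importlib.util.find_spec(name) is None]
    return python_ok, missing


if __name__ == "__main__":
    python_ok, missing = check_requirements()
    print("Python version OK:", python_ok)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

If any modules are reported missing, reinstalling the package with pip should pull them in again.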
Installation
From PyPI:
pip install Simpler-Kokoro
Or clone the repo and install locally:
git clone https://github.com/WilleIshere/SimplerKokoro.git
cd SimplerKokoro
pip install .
Examples
You can find runnable example scripts in the examples/ folder:
- basic_example.py: Basic usage, generate speech from text.
- subtitles_example.py: Generate speech with SRT subtitles.
- custom_speed_example.py: Generate speech with custom speed.
- custom_models_dir_example.py: Specify a custom directory for model downloads.
Usage
Basic Example
from Simpler_Kokoro import SimplerKokoro
# Create an instance
sk = SimplerKokoro()
# Load the available voices
voices = sk.list_voices()
# (optional) Print out the voices
for voice in voices:
    print(voice)  # Print out the voice object
# Use the first voice as example
selected_voice = voices[0]
# Generate speech
sk.generate(
    text='Hello, this is a test of the Simpler Kokoro voice synthesis.',  # Text to generate
    voice=selected_voice.name,  # Grab the name from the selected voice
    output_path='output.wav'  # Select the output path
)
Generate Speech with Subtitles
from Simpler_Kokoro import SimplerKokoro
# Create an instance
sk = SimplerKokoro()
# Load the available voices
voices = sk.list_voices()
# Use the first voice as example
selected_voice = voices[0]
# Generate speech
sk.generate(
    text='Hello, this will generate a subtitles.srt file along with output.wav',  # Text to generate
    voice=selected_voice.name,  # Grab the name from the selected voice
    output_path='output.wav',  # Select the output path
    write_subtitles=True,  # Enable subtitle generation
    subtitles_path='subtitles.srt',  # (optional) Specify the subtitle .srt filename
    subtitles_word_level=True  # (optional) Enable word-level timestamps
)
Generate Speech with Custom Speed
from Simpler_Kokoro import SimplerKokoro
# Create an instance
sk = SimplerKokoro()
# Load the available voices
voices = sk.list_voices()
# Use the first voice as example
selected_voice = voices[0]
# Generate speech
sk.generate(
    text='Hello, this is a test of the Simpler Kokoro voice synthesis.',  # Text to generate
    voice=selected_voice.name,  # Grab the name from the selected voice
    output_path='output.wav',  # Select the output path
    speed=1.5  # 1.5 means 150% speed; 1.0 is normal speed and 0.5 is half speed
)
Specify a Path to Download Models
from Simpler_Kokoro import SimplerKokoro
# Create an instance
sk = SimplerKokoro(models_dir='<PATH TO PUT MODELS>')  # Replace with the directory where models should be saved
# Load the available voices
voices = sk.list_voices()
# Use the first voice as example
selected_voice = voices[0]
# Generate speech
sk.generate(
    text='Select a custom directory for the models!',  # Text to generate
    voice=selected_voice.name,  # Grab the name from the selected voice
    output_path='output.wav'  # Select the output path
)
Command Line Interface (CLI)
You can also use the library from the command line.
Example:
python -m Simpler_Kokoro <command> [options]
| Command | Description | Options |
|---|---|---|
| list-voices | List available Kokoro voices | --repo, --models_dir, --log_level |
| generate | Generate speech audio from text | --text (required), --voice (required), --output (required), --speed, --write_subtitles, --subtitles_path, --subtitles_word_level, --repo, --models_dir, --log_level |
Global options:
| Option | Description | Default |
|---|---|---|
| --repo | HuggingFace repo to use for models | hexgrad/Kokoro-82M |
| --models_dir | Directory to store model files | models |
| --log_level | Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL) | INFO |
Generate command options:
| Option | Description | Default |
|---|---|---|
| --text | Text to synthesize (required) | |
| --voice | Voice name to use (required) | |
| --output | Output WAV file path (required) | |
| --speed | Speech speed multiplier | 1.0 |
| --write_subtitles | Write SRT subtitles | False |
| --subtitles_path | Path to save subtitles | subtitles.srt |
| --subtitles_word_level | Word-level subtitles | False |
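The options above can also be assembled from a Python script, for example when driving the CLI from another tool. The sketch below only builds the argument list for the `generate` command, using the three required options and `--speed` from the tables. `build_generate_command` is a hypothetical helper and `VOICE_NAME` is a placeholder rather than a real voice; substitute a name reported by `list-voices`.

```python
import shlex
import sys


def build_generate_command(text, voice, output, speed=None):
    """Build the argv list for `python -m Simpler_Kokoro generate ...`."""
    cmd = [sys.executable, "-m", "Simpler_Kokoro", "generate",
           "--text", text, "--voice", voice, "--output", output]
    if speed is not None:
        cmd += ["--speed", str(speed)]
    return cmd


if __name__ == "__main__":
    # VOICE_NAME is a placeholder, not a real voice name.
    cmd = build_generate_command("Hello from the CLI.", "VOICE_NAME", "output.wav", speed=1.2)
    print(shlex.join(cmd))  # Shell-quoted form, handy for copy and paste
    # To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the text contains spaces or quotes.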
Example Output Files
- output.wav: The synthesized speech audio file.
- output.srt: Subtitles in SRT format (if write_subtitles=True).
Sample SRT output
1
00:00:00,000 --> 00:00:01,200
Hello,

2
00:00:01,200 --> 00:00:02,500
this is a test.

3
00:00:02,500 --> 00:00:04,000
This is another sentence.
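To work with the generated timestamps in code, the SRT file can be read back with a small parser. The following is a minimal sketch, not part of Simpler-Kokoro: `parse_timestamp` and `parse_srt` are hypothetical helpers, and the parser assumes standard SRT layout in which cues are separated by blank lines.

```python
import re
from datetime import timedelta

TIME_RE = re.compile(r"(\d+):(\d{2}):(\d{2}),(\d{3})")


def parse_timestamp(stamp):
    """Convert an SRT timestamp such as 00:00:01,200 to a timedelta."""
    match = TIME_RE.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"Not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(part) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds, milliseconds=millis)


def parse_srt(content):
    """Parse SRT text into a list of (index, start, end, text) tuples."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # Skip empty or malformed blocks
        start, end = (parse_timestamp(part) for part in lines[1].split("-->"))
        cues.append((int(lines[0]), start, end, "\n".join(lines[2:])))
    return cues


if __name__ == "__main__":
    sample = (
        "1\n00:00:00,000 --> 00:00:01,200\nHello,\n\n"
        "2\n00:00:01,200 --> 00:00:02,500\nthis is a test.\n"
    )
    for index, start, end, text in parse_srt(sample):
        print(index, (end - start).total_seconds(), text)
```

Returning `timedelta` values keeps durations and offsets easy to compute, for instance when aligning subtitles with a video timeline.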
Build from Source
To build the package from source:
git clone https://github.com/WilleIshere/SimplerKokoro.git
cd SimplerKokoro
pip install build
python -m build
This will create distribution files in the dist/ directory:
- .whl (wheel) file for pip installation
- .tar.gz source archive
To install the built wheel locally:
pip install dist/Simpler_Kokoro-*.whl
You can now use the package as described in the usage section.
API
- list_voices(): Returns a list of available voices with metadata.
- generate(text, voice, output_path, speed=1.0, write_subtitles=False, subtitles_path='subtitles.srt', subtitles_word_level=False): Generates speech audio and optional subtitles.
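As an illustration of how these two calls might be combined, the sketch below generates one audio file and one subtitle file per input text. `generate_batch` and the `clip_NNN` file naming are hypothetical, not part of the library; the helper relies only on the `generate` keyword arguments listed above and accepts any object that exposes such a method.

```python
from pathlib import Path


def generate_batch(sk, texts, voice, out_dir="out", speed=1.0):
    """Generate one WAV and one SRT file per text; return the paths used."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for number, text in enumerate(texts, start=1):
        wav_path = out_dir / f"clip_{number:03d}.wav"  # Hypothetical naming scheme
        srt_path = wav_path.with_suffix(".srt")
        sk.generate(
            text=text,
            voice=voice,
            output_path=str(wav_path),
            speed=speed,
            write_subtitles=True,
            subtitles_path=str(srt_path),
        )
        results.append((wav_path, srt_path))
    return results
```

With the library installed, this could be called as `generate_batch(SimplerKokoro(), ["First line.", "Second line."], voice=voices[0].name)`, where `voices` comes from `list_voices()`.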
License
This project is licensed under the GPL-3.0 license.
