Kokoro TTS Generator

High-Quality Local Text-to-Speech Generator

Generate high-quality speech from text using the powerful Kokoro TTS pipeline with an intuitive web interface.

🚀 Quick Start • 📦 Download • 🔧 Build • 📚 Documentation • 🤝 Contributing


✨ Features

🎯 Core Functionality

  • High-Quality TTS: Powered by the advanced Kokoro TTS pipeline
  • Multiple Voices: Choose from a wide variety of natural-sounding voices
  • Customizable Output: Adjust speech speed and pitch with precision
  • Batch Processing: Generate audio from multi-paragraph text input with natural pauses
  • Real-time Preview: Instant audio playback within the interface

🖥️ User Interface

  • Modern Design: Built with NiceGUI for a sleek, responsive web interface
  • Intuitive Controls: Simple, user-friendly experience
  • Progress Indicators: Visual feedback for pipeline loading and audio generation
  • Dark Mode: Easy on the eyes for extended use
  • Responsive Layout: Works across devices and screen sizes

💾 File Management

  • WAV Format: High-quality audio output
  • Automatic Naming: Unique identifiers for each generated file
  • Local Processing: All data processed on your machine for privacy
  • Cross-Platform: Works on Windows and Linux
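
The automatic naming and WAV output described above could look roughly like the following. This is a minimal sketch using only the standard library, not the project's actual code: the helper names, the `tts_` file prefix, and the 24 kHz sample rate are assumptions, and silence stands in for generated speech.

```python
import uuid
import wave
from pathlib import Path


def unique_wav_path(output_dir: Path) -> Path:
    """Return a fresh .wav path inside output_dir (hypothetical helper)."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # A random UUID makes a name collision between two runs practically impossible.
    return output_dir / f"tts_{uuid.uuid4().hex}.wav"


def write_silence(path: Path, seconds: float = 0.1, sample_rate: int = 24000) -> None:
    """Write 16-bit mono silence as a stand-in for synthesized audio."""
    frame_count = round(seconds * sample_rate)
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * frame_count)
```

Calling `write_silence(unique_wav_path(Path("final_audio")))` would create one uniquely named file in the output directory listed under Project Structure.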

🚀 Quick Start

Option 1: Using Compiled Versions (Recommended)

  1. Download the appropriate release for your platform from the Releases page
  2. Extract the downloaded archive (if applicable)
  3. Run the application:
    • Windows: Double-click the .exe file
    • Linux: Make the AppImage executable (chmod +x KokoroTTSGenerator.AppImage) and run it
  4. Wait for the TTS pipeline to initialize on first run (may take a few minutes)

Option 2: Running from Source

# Clone the repository
git clone https://github.com/WilleIshere/KokoroTTSGenerator.git
cd KokoroTTSGenerator

# Create virtual environment and install dependencies with UV
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv sync

# Run the application
python app.py

📦 Download

System Requirements

  • Operating System: Windows 10/11 or Linux
  • Memory: 4GB RAM minimum (8GB recommended)
  • Storage: 2GB+ free space for model files and audio output
  • Internet: Required for initial model download
  • No Python installation needed for compiled versions
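
When running from source, some of these requirements can be checked before launching. The sketch below is illustrative only and is not part of the project: the function names are hypothetical, and it covers the operating system and free disk space but not RAM, which the standard library does not report portably.

```python
import platform
import shutil

MIN_FREE_BYTES = 2 * 1024**3              # "2GB+ free space" from the list above
SUPPORTED_SYSTEMS = ("Windows", "Linux")  # platforms named in the list above


def find_problems(system: str, free_bytes: int) -> list[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems = []
    if system not in SUPPORTED_SYSTEMS:
        problems.append(f"unsupported operating system: {system}")
    if free_bytes < MIN_FREE_BYTES:
        problems.append("less than 2 GB of free disk space")
    return problems


def check_this_machine(path: str = ".") -> list[str]:
    """Run the checks against the current machine and the disk holding `path`."""
    return find_problems(platform.system(), shutil.disk_usage(path).free)
```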

Latest Release

  • Version: 0.1.0
  • Formats:
    • Windows: Standalone executable (.exe) - no installation required
    • Linux: AppImage (.AppImage) - runs anywhere
    • Source: Python package (requires Python 3.12 & UV)
  • Size: ~300MB (includes all dependencies and runtime)

⬇️ Download Latest Release

Note: For the compiled versions, no Python installation or additional dependencies are required. Everything is bundled in the executable.

🎮 Usage

Getting Started

  1. First Launch: Wait for the TTS pipeline to initialize (first run only)
  2. Select Voice: Choose from available voices in the dropdown
  3. Adjust Parameters: Set speech speed and pitch using the sliders
  4. Enter Text: Type or paste text into the text area
  5. Generate: Click "Generate Audio" to create speech
  6. Enjoy: Preview the audio directly in the app and download the WAV file
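
For readers curious what the steps above correspond to in code, the following sketch shows one way to drive the `kokoro` package directly. It is an assumption about how such an app could work, not a copy of this project's `src/tts.py`: it follows the `KPipeline` usage shown in Kokoro's own documentation, the function name and output file names are hypothetical, and pitch is left out because the source does not say how the application applies it.

```python
def generate_wav_segments(text: str, voice: str = "af_bella",
                          speed: float = 1.0, out_prefix: str = "segment") -> list[str]:
    """Synthesize `text` and write one WAV file per chunk Kokoro yields."""
    # Imported lazily so that defining this function does not load torch and the model.
    from kokoro import KPipeline
    import soundfile as sf

    pipeline = KPipeline(lang_code="a")  # "a" selects American English
    written = []
    # Splitting on newlines makes each paragraph its own chunk.
    chunks = pipeline(text, voice=voice, speed=speed, split_pattern=r"\n+")
    for index, (graphemes, phonemes, audio) in enumerate(chunks):
        path = f"{out_prefix}_{index}.wav"
        sf.write(path, audio, 24000)  # Kokoro produces 24 kHz audio
        written.append(path)
    return written
```

With the dependencies installed, `generate_wav_segments("Hello there.", voice="af_sky", speed=1.2)` would write `segment_0.wav` to the current directory.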

Voice Options

The application includes a variety of high-quality voices:

  • Female Voices: af_alloy, af_aoede, af_bella, af_jessica, af_kore, af_nicole, af_nova, af_river, af_sarah, af_sky
  • Male Voices: am_adam, am_echo, am_eric, am_fenrir, am_liam, am_michael, am_onyx, am_puck, am_santa

All voices are included in the compiled versions - no additional downloads required.
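
The voice IDs follow a two-letter prefix pattern: in the lists above, `af_` marks female voices and `am_` marks male voices (in Kokoro's naming, the leading `a` denotes American English). A small sketch of grouping voices by that prefix, with hypothetical helper names:

```python
VOICES = [
    "af_alloy", "af_aoede", "af_bella", "af_jessica", "af_kore",
    "af_nicole", "af_nova", "af_river", "af_sarah", "af_sky",
    "am_adam", "am_echo", "am_eric", "am_fenrir", "am_liam",
    "am_michael", "am_onyx", "am_puck", "am_santa",
]

GENDER_BY_CODE = {"f": "female", "m": "male"}


def voice_gender(voice_id: str) -> str:
    """Read the gender from the second letter of the prefix, e.g. 'af_sky' -> 'female'."""
    prefix, _, name = voice_id.partition("_")
    if len(prefix) != 2 or not name or prefix[1] not in GENDER_BY_CODE:
        raise ValueError(f"unexpected voice id: {voice_id!r}")
    return GENDER_BY_CODE[prefix[1]]


def group_voices(voice_ids: list[str]) -> dict[str, list[str]]:
    """Group voice ids into 'female' and 'male' lists, keeping input order."""
    groups = {"female": [], "male": []}
    for voice_id in voice_ids:
        groups[voice_gender(voice_id)].append(voice_id)
    return groups
```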

Tips & Tricks

  • Long Text: Break long texts into paragraphs for better processing
  • Punctuation: Use proper punctuation for natural speech rhythm
  • Speed & Pitch: Experiment with different settings for optimal results
  • Browser Compatibility: Works best in modern browsers
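
The "Long Text" tip can also be applied programmatically before pasting text into the app. This is a minimal sketch with a hypothetical helper name; it treats one or more blank lines as a paragraph boundary, which is an assumption about what counts as a paragraph.

```python
import re


def split_paragraphs(text: str) -> list[str]:
    """Split text on blank lines and drop empty or whitespace-only pieces."""
    pieces = re.split(r"\n\s*\n", text)
    return [piece.strip() for piece in pieces if piece.strip()]
```

For example, `split_paragraphs("Hello.\n\nWorld.")` yields two paragraphs that can be generated one at a time.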

๐Ÿ—๏ธ Project Structure

The project is split into small modules so that the interface and the TTS logic can be maintained and extended independently:

KokoroTTSGenerator/
├── 🚀 app.py                    # Main entry point
├── 📁 src/                      # Source code
│   ├── gui.py                   # Web interface implementation
│   └── tts.py                   # TTS pipeline implementation
├── 📁 final_audio/              # Output directory for generated audio
├── 📁 temp/                     # Temporary working directory
├── 📄 pyproject.toml            # Dependencies and project configuration
└── 📄 uv.lock                   # UV dependencies lockfile

Architecture Highlights

  • Modern Web Interface: Built with NiceGUI for a responsive experience
  • Efficient Pipeline: Fast, high-quality audio generation
  • Clean Separation: UI and TTS logic kept separate for maintainability
  • Python-powered: Leverages the best Python libraries for TTS
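
The "Clean Separation" point can be illustrated with a small sketch. The class and method names below are hypothetical and do not come from `src/gui.py` or `src/tts.py`; the idea is only that the interface layer depends on a narrow engine interface, so the engine can be swapped or faked in tests.

```python
from typing import Protocol


class SpeechEngine(Protocol):
    """What the UI layer needs from the TTS layer (hypothetical interface)."""

    def synthesize(self, text: str, voice: str, speed: float) -> bytes: ...


class FakeEngine:
    """Stand-in engine that returns a description instead of real audio."""

    def synthesize(self, text: str, voice: str, speed: float) -> bytes:
        return f"{voice}:{speed}:{text}".encode()


class GenerateController:
    """UI-side logic: validates input, then delegates to whichever engine it was given."""

    def __init__(self, engine: SpeechEngine) -> None:
        self.engine = engine

    def on_generate_clicked(self, text: str, voice: str, speed: float) -> bytes:
        cleaned = text.strip()
        if not cleaned:
            raise ValueError("text is empty")
        return self.engine.synthesize(cleaned, voice, speed)
```

Because the controller only knows about `SpeechEngine`, a real Kokoro-backed class could replace `FakeEngine` without changes to the interface code.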

🔧 Building from Source

Prerequisites

# Ensure you have Python 3.12 installed
python --version

# Clone the repository
git clone https://github.com/WilleIshere/KokoroTTSGenerator.git
cd KokoroTTSGenerator

Development Setup

# Create virtual environment and install dependencies with UV
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv sync

# For development tools
uv pip install -e ".[dev]"

Running the Application

uv run app.py

📚 Documentation

  • Project Structure: Simple, modular design for easy maintenance
  • Kokoro TTS: Leverages the powerful Kokoro TTS pipeline
  • NiceGUI: Built with a modern web interface framework
  • Compiled Versions: Standalone executables for Windows and Linux

🤝 Contributing

We welcome contributions! Here's how you can help:

Development Setup

# Fork and clone the repository
git clone https://github.com/yourusername/KokoroTTSGenerator.git
cd KokoroTTSGenerator

# Create virtual environment and install dependencies
uv venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
uv sync
uv pip install -e ".[dev]"

Ways to Contribute

  • ๐Ÿ› Bug Reports: Found an issue? Please open an issue
  • ๐Ÿ’ก Feature Requests: Have an idea? We'd love to hear it
  • ๐Ÿ”ง Code Contributions: Submit a pull request
  • ๐Ÿ“š Documentation: Help improve our docs

Development Guidelines

  • Follow PEP 8 style guidelines
  • Add tests for new features
  • Update documentation for changes
  • Ensure cross-platform compatibility

🛠️ Technologies

  • Frontend: NiceGUI (Python web interface framework)
  • TTS Engine: Kokoro TTS pipeline (v0.9.4+)
  • Audio: soundfile, numpy
  • Package Management: UV (Fast, reliable Python package manager)
  • Dependencies: kokoro, nicegui, torch, soundfile
  • Distribution: Standalone executables for Windows and Linux

📝 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Kokoro TTS: Amazing TTS pipeline that powers this application
  • NiceGUI: Beautiful modern web interface framework
  • Python Community: For the incredible ecosystem of libraries

📞 Support

Getting Help

  • 📖 Documentation: Check our comprehensive docs
  • 🐛 Issues: Report bugs or request features on GitHub
  • 💬 Discussions: Community Q&A and general discussion

Common Issues

  • First Run Slow: Initial pipeline loading downloads models and may take a few minutes
  • Memory Usage: TTS models require significant RAM; 8GB recommended for optimal performance
  • Antivirus Warnings: Some antivirus software may flag compiled executables; these are false positives
  • Linux Permissions: On Linux, remember to make AppImage files executable before running
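
The Linux permissions step is normally done with `chmod +x`, as shown in Quick Start. For setup scripts written in Python, an equivalent using only the standard library is sketched below; the helper name is hypothetical.

```python
import os
import stat


def make_executable(path: str) -> None:
    """Add the execute bit for user, group, and others, like `chmod +x`."""
    current_mode = os.stat(path).st_mode
    os.chmod(path, current_mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
```

For example, `make_executable("KokoroTTSGenerator.AppImage")` has the same effect as the `chmod` command from the Quick Start section.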

🔄 Version History

v0.1.0 (Latest)

  • ✨ Initial release with core functionality
  • 🎯 Multiple voice options
  • 🎛️ Speed and pitch controls
  • 🎮 Web-based user interface
  • 🔊 High-quality audio output
  • 📦 Compiled versions for Windows and Linux

Made with ❤️ by WilleIshere

⭐ Star this repo • 🍴 Fork it • 📝 Report Issues