AI Voice Generator

A text-to-speech application that converts text to natural-sounding speech using two AI speech engines: Azure Cognitive Services and Google Cloud Text-to-Speech.

High Level Design

Frontend UI

Features

🎙️ Dual speech engines: Azure Cognitive Services and Google Cloud Text-to-Speech
🌐 Support for 12+ languages, including English, Spanish, and French
👨‍👩‍👧‍👦 Multiple voice options for each language and gender
🔊 High-quality neural voice synthesis
💾 Download synthesized speech as MP3 files
🎨 Beautiful, responsive UI with glassmorphism design
✨ Interactive animations and visual feedback

Tech Stack

Frontend

React 19
Framer Motion (animations)
React Icons
Modern CSS with glassmorphism effects
Vite (build tool)

Backend

Node.js
Express
Azure Speech SDK
Google Cloud Text-to-Speech API

Installation

Prerequisites

Node.js (v14 or later)
npm or yarn
Azure Cognitive Services account
Google Cloud Platform account with Text-to-Speech API enabled

Setup

Clone the repository

```
git clone https://github.com/yourusername/ai-voice-generator.git
cd ai-voice-generator
```

Install dependencies

```
# Install frontend dependencies
cd client
npm install

# Install backend dependencies
cd ../server
npm install
```

Create environment variables

```
# In the server directory, create a .env file
touch .env
```

Add the following environment variables:

```
AZURE_SPEECH_KEY=your_azure_key
AZURE_SERVICE_REGION=your_azure_region
GOOGLE_APPLICATION_CREDENTIALS=path_to_google_credentials.json
PORT=5000
```
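
A missing or blank variable tends to surface later as an opaque API error, so it can help to check the environment when the server starts. The following is a minimal sketch, not the project's actual startup code: the variable names come from the list above, while `findMissingEnvVars` and `resolvePort` are hypothetical helpers introduced for illustration.

```javascript
// Variable names taken from the .env example in this README.
const REQUIRED_ENV_VARS = [
  "AZURE_SPEECH_KEY",
  "AZURE_SERVICE_REGION",
  "GOOGLE_APPLICATION_CREDENTIALS",
];

// Hypothetical helper: returns the names of required variables
// that are unset or contain only whitespace.
function findMissingEnvVars(env, required = REQUIRED_ENV_VARS) {
  return required.filter((name) => !env[name] || env[name].trim() === "");
}

// Hypothetical helper: parses PORT, falling back to 5000
// (the value used in the .env example) when it is absent or invalid.
function resolvePort(env) {
  const port = Number.parseInt(env.PORT ?? "", 10);
  return Number.isInteger(port) && port > 0 && port < 65536 ? port : 5000;
}

// Warn early instead of failing on the first speech request.
const missing = findMissingEnvVars(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```

This assumes the variables are already loaded into `process.env` (for example by a dotenv-style loader) before the check runs.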

Usage

Starting the Application

Start the backend server
```
cd server
npm start
```
In a new terminal, start the frontend
```
cd client
npm run dev
```
Open your browser and navigate to http://localhost:5173/

Converting Text to Speech

Type or paste your text in the input field
Select your preferred language
Choose a voice type (male/female)
Select the speech engine (Azure or Google)
Click the "Generate Speech" button
Use the player controls to listen to the generated speech
Click "Download" to save the audio as an MP3 file

API Reference

Backend Endpoints

Generate Speech

POST /text-to-speech

Body:

```
{
  "text": "Text to convert to speech",
  "language": "en-US",
  "voice": "en-US-GuyNeural",
  "engine": "azure"
}
```
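
As a rough illustration of how a client might send this body, the sketch below builds the request and posts it with `fetch`. Several details are assumptions rather than documented behavior: the base URL uses port 5000 from the `.env` example, `"google"` is assumed to be the second accepted `engine` value, the response is assumed to contain audio bytes, and `buildSpeechRequest` and `generateSpeech` are hypothetical names.

```javascript
// Assumption: the backend listens on PORT=5000 from the .env example.
const API_BASE_URL = "http://localhost:5000";

// Hypothetical helper: builds the fetch() arguments for the Generate Speech endpoint.
function buildSpeechRequest({ text, language, voice, engine }) {
  if (typeof text !== "string" || text.trim() === "") {
    throw new Error("text must be a non-empty string");
  }
  // Assumption: "google" is the value for the second engine; only "azure" is documented.
  if (engine !== "azure" && engine !== "google") {
    throw new Error('engine must be "azure" or "google"');
  }
  return {
    url: `${API_BASE_URL}/text-to-speech`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ text, language, voice, engine }),
    },
  };
}

// Sends the request and returns the response body as a Blob.
// Assumption: the endpoint answers with audio data; adjust if it returns JSON.
async function generateSpeech(params) {
  const { url, options } = buildSpeechRequest(params);
  const response = await fetch(url, options);
  if (!response.ok) {
    throw new Error(`Speech request failed with status ${response.status}`);
  }
  return response.blob();
}
```

Keeping the request construction separate from the network call makes the payload easy to inspect or test without a running backend.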

Get Available Voices

GET /api/voices?engine=azure&language=en-US
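
The query string can be assembled with the standard `URL` API rather than by string concatenation, which keeps parameter encoding correct. This is a small sketch: `buildVoicesUrl` is a hypothetical helper, and the base URL in the usage line assumes the backend runs on port 5000 as in the `.env` example.

```javascript
// Hypothetical helper: builds the Get Available Voices URL for a given engine and language.
function buildVoicesUrl(baseUrl, engine, language) {
  const url = new URL("/api/voices", baseUrl);
  url.searchParams.set("engine", engine);
  url.searchParams.set("language", language);
  return url.toString();
}

// Assumption: the backend is reachable at http://localhost:5000.
const voicesUrl = buildVoicesUrl("http://localhost:5000", "azure", "en-US");
console.log(voicesUrl);
```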

Configuration

Azure Cognitive Services

Create an Azure account at portal.azure.com
Create a Speech Service resource
Copy the key and region into your .env file as AZURE_SPEECH_KEY and AZURE_SERVICE_REGION

Google Cloud Text-to-Speech

Create a Google Cloud account at console.cloud.google.com
Enable the Text-to-Speech API
Create a service account and download the credentials JSON file
Set GOOGLE_APPLICATION_CREDENTIALS in your .env file to the path of this file
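
If Google requests fail after these steps, one quick check is whether the downloaded file really is a service account key. The sketch below inspects the JSON text for fields that such key files contain (`type`, `project_id`, `private_key`, `client_email`). `checkServiceAccountKey` is a hypothetical helper and is not part of this project or of any Google library.

```javascript
// Fields expected in a Google service account key file.
const REQUIRED_KEY_FIELDS = ["type", "project_id", "private_key", "client_email"];

// Hypothetical helper: returns a list of problems found in the key file text.
// An empty list means the basic shape looks right; it does not prove the key is valid.
function checkServiceAccountKey(jsonText) {
  let key;
  try {
    key = JSON.parse(jsonText);
  } catch {
    return ["file is not valid JSON"];
  }
  if (key === null || typeof key !== "object") {
    return ["file is not a JSON object"];
  }
  const problems = REQUIRED_KEY_FIELDS
    .filter((field) => typeof key[field] !== "string" || key[field] === "")
    .map((field) => `missing field: ${field}`);
  if (typeof key.type === "string" && key.type !== "service_account") {
    problems.push(`unexpected type: ${key.type}`);
  }
  return problems;
}
```

The function takes the file contents as a string, so it can be fed from `fs.readFileSync(process.env.GOOGLE_APPLICATION_CREDENTIALS, "utf8")` on the server.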

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

Troubleshooting

Common Issues

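API key errors: Ensure AZURE_SPEECH_KEY, AZURE_SERVICE_REGION, and GOOGLE_APPLICATION_CREDENTIALS are set correctly in the server's .env file, then restart the backend server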
Voice not loading: Check your internet connection and API quotas
Audio playback issues: Try using a different browser

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Azure Cognitive Services for their speech synthesis API
Google Cloud for their Text-to-Speech technology
The React community for the amazing frontend tools
