An advanced text-to-speech application that converts text to natural-sounding speech using multiple AI speech engines.
- 🎙️ Dual speech engines: Azure Cognitive Services and Google Cloud Text-to-Speech
- 🌐 Support for 12+ languages, including English, Spanish, and French
- 👨👩👧👦 Multiple voice options for each language and gender
- 🔊 High-quality neural voice synthesis
- 💾 Download synthesized speech as MP3 files
- 🎨 Beautiful, responsive UI with glassmorphism design
- ✨ Interactive animations and visual feedback
- React 19
- Framer Motion (animations)
- React Icons
- Modern CSS with glassmorphism effects
- Vite (build tool)
- Node.js
- Express
- Azure Speech SDK
- Google Cloud Text-to-Speech API
- Node.js (v14 or later; recent Vite releases require Node.js 18 or later)
- npm or yarn
- Azure Cognitive Services account
- Google Cloud Platform account with Text-to-Speech API enabled
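
The Node.js requirement above can be checked before installing anything. The following is a minimal sketch, not part of the project: `meetsMinimumNode` and `MIN_NODE_MAJOR` are hypothetical names, and the minimum of 14 is taken from the prerequisites list.

```javascript
// Minimum major version, taken from the prerequisites list above.
const MIN_NODE_MAJOR = 14;

// Hypothetical helper: returns true when a version string such as "18.19.0"
// or "v18.19.0" has a major version of at least minMajor.
function meetsMinimumNode(version, minMajor = MIN_NODE_MAJOR) {
  const majorText = String(version).replace(/^v/, "").split(".")[0];
  const major = Number.parseInt(majorText, 10);
  return Number.isInteger(major) && major >= minMajor;
}

// Check the running interpreter and report a problem without exiting.
if (!meetsMinimumNode(process.versions.node)) {
  console.error(
    `Node.js ${MIN_NODE_MAJOR} or later is required, found ${process.versions.node}`
  );
}
```

Pass a higher `minMajor` if the installed tooling needs a newer Node.js release.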
- Clone the repository
    git clone https://github.com/yourusername/ai-voice-generator.git
    cd ai-voice-generator
- Install dependencies
    # Install frontend dependencies
    cd client
    npm install
    # Install backend dependencies
    cd ../server
    npm install
- Create environment variables
    # In the server directory, create a .env file
    touch .env
  Add the following environment variables:
    AZURE_SPEECH_KEY=your_azure_key
    AZURE_SERVICE_REGION=your_azure_region
    GOOGLE_APPLICATION_CREDENTIALS=path_to_google_credentials.json
    PORT=5000
- Start the backend server
    cd server
    npm start
- In a new terminal, start the frontend
    cd client
    npm run dev
- Open your browser and navigate to
    http://localhost:5173/
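
A missing or blank entry in the .env file is an easy mistake to make during setup, so a startup check can help. The sketch below is illustrative only: `findMissingEnvVars` and `resolvePort` are hypothetical helpers, and it assumes the variables have already been loaded into `process.env` (for example by a package such as dotenv, which the project may or may not use).

```javascript
// Variable names copied from the .env example above.
const REQUIRED_ENV_VARS = [
  "AZURE_SPEECH_KEY",
  "AZURE_SERVICE_REGION",
  "GOOGLE_APPLICATION_CREDENTIALS",
];

// Hypothetical helper: lists required variables that are unset or blank.
function findMissingEnvVars(env) {
  return REQUIRED_ENV_VARS.filter(
    (name) => !env[name] || String(env[name]).trim() === ""
  );
}

// Hypothetical helper: PORT is treated as optional, and the fallback of 5000
// matches the value in the .env example.
function resolvePort(env) {
  const port = Number.parseInt(env.PORT, 10);
  return Number.isInteger(port) && port > 0 ? port : 5000;
}

// Warn rather than exit, so the sketch is safe to run anywhere.
const missing = findMissingEnvVars(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(", ")}`);
}
```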
- Type or paste your text in the input field
- Select your preferred language
- Choose a voice type (male/female)
- Select the speech engine (Azure or Google)
- Click the "Generate Speech" button
- Use the player controls to listen to the generated speech
- Click "Download" to save the audio as an MP3 file
POST http://localhost:5000/text-to-speech (port 5000 comes from PORT in the .env file)
Body:
{
  "text": "Text to convert to speech",
  "language": "en-US",
  "voice": "en-US-GuyNeural",
  "engine": "azure"
}

GET /api/voices?engine=azure&language=en-US
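
The two endpoints above can be called from any HTTP client. The following sketch shows one way to build the requests, under several assumptions: the server listens on port 5000 (the PORT value from the .env example), `"google"` is the engine value for Google Cloud (the source only shows `"azure"`), and the response body is raw audio. All function names are hypothetical, and `fetch` is assumed to be available globally (browsers, or Node.js 18 and later).

```javascript
// Assumed base URL: localhost plus the PORT value from the .env example.
const BASE_URL = "http://localhost:5000";

// Builds fetch options for the text-to-speech endpoint described above.
// The accepted engine values are an assumption beyond "azure".
function buildSpeechRequest({ text, language, voice, engine }) {
  if (!text || text.trim() === "") {
    throw new Error("text must not be empty");
  }
  if (engine !== "azure" && engine !== "google") {
    throw new Error(`unknown engine: ${engine}`);
  }
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text, language, voice, engine }),
  };
}

// Builds the voices URL with properly encoded query parameters.
function buildVoicesUrl(engine, language, baseUrl = BASE_URL) {
  const url = new URL("/api/voices", baseUrl);
  url.searchParams.set("engine", engine);
  url.searchParams.set("language", language);
  return url.toString();
}

// Defined but not called here: sends the request and returns the response bytes.
// Treating the response as raw audio is an assumption, not documented behavior.
async function requestSpeech(params, baseUrl = BASE_URL) {
  const response = await fetch(
    `${baseUrl}/text-to-speech`,
    buildSpeechRequest(params)
  );
  if (!response.ok) {
    throw new Error(`request failed with status ${response.status}`);
  }
  return response.arrayBuffer();
}
```

Validating the text and engine before sending avoids a round trip for requests the server would likely reject.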
- Create an Azure account at portal.azure.com
- Create a Speech Service resource
- Copy the key and region into AZURE_SPEECH_KEY and AZURE_SERVICE_REGION in your .env file
- Create a Google Cloud account at console.cloud.google.com
- Enable the Text-to-Speech API
- Create a service account and download the credentials JSON file
- Set GOOGLE_APPLICATION_CREDENTIALS in your .env file to the path of this file
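
A downloaded service account file that is truncated or of the wrong type often only fails once the first request is made. The sketch below checks the file contents up front. It is a hedged illustration, not project code: `checkCredentialsJson` and `findMissingKeyFields` are hypothetical names, and the field list covers only a few fields that Google Cloud service account key files contain.

```javascript
// A subset of the fields found in a Google Cloud service account key file.
const REQUIRED_KEY_FIELDS = ["type", "project_id", "private_key", "client_email"];

// Hypothetical helper: returns the names of fields that are missing or empty.
function findMissingKeyFields(credentials) {
  return REQUIRED_KEY_FIELDS.filter(
    (field) => typeof credentials[field] !== "string" || credentials[field] === ""
  );
}

// Hypothetical helper: takes the file contents as text and returns a list of
// problems. An empty list means the basic structure looks plausible.
function checkCredentialsJson(jsonText) {
  let parsed;
  try {
    parsed = JSON.parse(jsonText);
  } catch (err) {
    return [`invalid JSON: ${err.message}`];
  }
  if (parsed === null || typeof parsed !== "object") {
    return ["credentials must be a JSON object"];
  }
  return findMissingKeyFields(parsed).map((field) => `missing field: ${field}`);
}
```

The file contents would be read from the path in GOOGLE_APPLICATION_CREDENTIALS and passed in as a string. The check confirms structure only and does not verify that the key is valid.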
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
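- API key errors: Ensure AZURE_SPEECH_KEY, AZURE_SERVICE_REGION, and GOOGLE_APPLICATION_CREDENTIALS are set correctly in the server's .env file, then restart the backend server
- Voices not loading: Check your internet connection and your Azure and Google Cloud API quotas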
- Audio playback issues: Try using a different browser
This project is licensed under the MIT License - see the LICENSE file for details.
- Azure Cognitive Services for their speech synthesis API
- Google Cloud for their Text-to-Speech technology
- The React community for the amazing frontend tools

