A production-ready, conversational AI voice bot that bridges Exotel's WebSocket streaming with OpenAI's Realtime API for natural, speech-to-speech conversations over phone calls.
- Natural Conversations: Real-time speech-to-speech using OpenAI's latest Realtime API
- Telephony Integration: Seamless integration with Exotel's voice streaming services
- Smart Interruption: Handles conversation interruptions naturally
- Audio Enhancement: Built-in noise suppression and audio optimization for telephony
- Real-time Processing: 200ms audio buffering for smooth conversation flow
- Security First: Environment-based configuration, no hardcoded secrets
- High-Quality Audio: 24kHz PCM16 audio format for superior voice quality
- Python 3.8+
- OpenAI API key with Realtime API access
- Exotel account with Voicebot Applet access
- Clone the repository:
    git clone <repository-url>
    cd Agent-Stream
- Create a virtual environment:
    python3 -m venv venv
    source venv/bin/activate  # Windows: venv\Scripts\activate
- Install dependencies:
    pip install -r requirements.txt
- Configure environment variables:
    cp env.example .env
    # Edit .env with your OpenAI API key and other settings
Edit your .env file with the following required settings:
# REQUIRED - Get from OpenAI dashboard
OPENAI_API_KEY=your-openai-api-key-here
# SERVER CONFIG
SERVER_HOST=0.0.0.0
SERVER_PORT=5000
# BOT PERSONALITY
COMPANY_NAME=Your Company Name
SALES_BOT_NAME=Sarah
# AUDIO SETTINGS
SAMPLE_RATE=24000
AUDIO_CHUNK_SIZE=200
# Start the bot
python main.py
# Test the configuration
python main.py --config-check
# Run system tests
python main.py --test
The bot will start a WebSocket server on 0.0.0.0:5000.
- Local: ws://localhost:5000
- Public: Use ngrok or your server's public IP
- URL: wss://your-domain.com/?sample-rate=24000
- Sample Rate: 24kHz (recommended for high quality)
- Audio Format: Raw/slin (16-bit PCM)
- Bidirectional Streaming: Enabled
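The settings above determine how many bytes each audio chunk carries. The following is a minimal sketch, not the bot's actual code: it reads the `sample-rate` query parameter from the stream URL and computes the size of one 200 ms chunk of 16-bit PCM. The helper names `sample_rate_from_url` and `chunk_bytes` are hypothetical, and mono audio is an assumption.

```python
from urllib.parse import urlparse, parse_qs

BYTES_PER_SAMPLE = 2  # 16-bit PCM (raw/slin)


def sample_rate_from_url(url: str, default: int = 24000) -> int:
    """Read the `sample-rate` query parameter, falling back to a default."""
    values = parse_qs(urlparse(url).query).get("sample-rate")
    return int(values[0]) if values else default


def chunk_bytes(sample_rate: int, chunk_ms: int = 200, channels: int = 1) -> int:
    """Bytes of raw PCM16 audio in one chunk of `chunk_ms` milliseconds.

    channels=1 (mono) is an assumption; the README does not state it.
    """
    return sample_rate * BYTES_PER_SAMPLE * channels * chunk_ms // 1000


rate = sample_rate_from_url("wss://your-domain.com/?sample-rate=24000")
print(rate, chunk_bytes(rate))  # 24000 9600
```

At 24 kHz, a 200 ms chunk is 24000 × 2 × 0.2 = 9600 bytes; at 8 kHz the same chunk is 3200 bytes.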
{
"event": "connected"
}
.
├── .env                    # Environment variables (local)
├── .gitignore              # Git ignore file
├── LICENSE                 # Project license
├── README.md               # This README file
├── config.py               # Centralized configuration
├── core/                   # Core bot logic and framework
│   ├── __init__.py
│   ├── bot_framework.py
│   └── openai_realtime_sales_bot.py
├── engines/                # AI engine components (STT, TTS, NLP, etc.)
│   ├── __init__.py
│   ├── audio_enhancer.py
│   ├── media_resampler.py
│   ├── nlp_engine.py
│   ├── stt_engine.py
│   └── tts_engine.py
├── env.example             # Example environment variables
├── main.py                 # Main entry point for the application
├── requirements.txt        # Python dependencies
└── venv/                   # Python virtual environment
- SAMPLE_RATE: Audio sample rate (8000, 16000, 24000)
- AUDIO_CHUNK_SIZE: Chunk size in milliseconds (default: 200)
- BUFFER_SIZE_MS: Audio buffer size in milliseconds (default: 160)
- COMPANY_NAME: Your company name
- SALES_BOT_NAME: Bot's name
- OPENAI_VOICE: Voice selection (coral, nova, shimmer)
- SERVER_HOST: Server host (default: 0.0.0.0)
- SERVER_PORT: Server port (default: 5000)
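As an illustration of how these variables might be read and validated, here is a hedged sketch. It is not the project's actual `config.py`: the `BotConfig` class and `load_config` function are hypothetical names, and the `SAMPLE_RATE` default of 24000 is an assumption taken from the example `.env` values rather than a documented default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

ALLOWED_SAMPLE_RATES = (8000, 16000, 24000)


@dataclass(frozen=True)
class BotConfig:
    """Hypothetical container for the settings listed above."""
    openai_api_key: str
    server_host: str
    server_port: int
    sample_rate: int
    audio_chunk_size_ms: int
    buffer_size_ms: int


def load_config(env: Optional[Mapping[str, str]] = None) -> BotConfig:
    """Build a BotConfig from environment variables, applying defaults."""
    env = os.environ if env is None else env

    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")

    # Assumed default: 24000, matching the example .env file.
    sample_rate = int(env.get("SAMPLE_RATE", "24000"))
    if sample_rate not in ALLOWED_SAMPLE_RATES:
        raise ValueError(f"SAMPLE_RATE must be one of {ALLOWED_SAMPLE_RATES}")

    return BotConfig(
        openai_api_key=api_key,
        server_host=env.get("SERVER_HOST", "0.0.0.0"),
        server_port=int(env.get("SERVER_PORT", "5000")),
        sample_rate=sample_rate,
        audio_chunk_size_ms=int(env.get("AUDIO_CHUNK_SIZE", "200")),
        buffer_size_ms=int(env.get("BUFFER_SIZE_MS", "160")),
    )


# Usage with an explicit mapping, so the example runs without a real key.
config = load_config({"OPENAI_API_KEY": "your-openai-api-key-here"})
print(config.server_host, config.server_port, config.sample_rate)
```

Passing a mapping instead of reading `os.environ` directly keeps the loader easy to test; calling `load_config()` with no argument would read the real environment.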
# Install ngrok, then expose the bot's port
./ngrok http 5000
# Use: wss://xxxxx.ngrok-free.app
# Setup on DigitalOcean/AWS/GCP
sudo ufw allow 5000
python main.py
# Use: wss://your-server-ip:5000
# Build and run
docker build -t voice-bot .
docker run --env-file .env -p 5000:5000 voice-bot
# Test configuration
python main.py --config-check
# Test bot connection
python main.py --test
# Using wscat
wscat -c ws://localhost:5000
# Send test message
{"event": "connected"}
- All sensitive information is stored in environment variables
- No hardcoded API keys or tokens
- .env file is gitignored
- Use HTTPS/WSS in production
- Call Duration
- Bot Response Time
- Audio Quality Score
- Conversation Completion Rate
- Error Rate
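Some of these metrics could be approximated from the log file. The sketch below is an illustration under assumptions, not part of the bot: it counts the log markers this README greps for (`NEW EXOTEL CONNECTION`, `CONVERSATION COMPLETED`, `ERROR`) and derives a rough completion rate. The function names and the sample log line layout are hypothetical; only the marker strings come from this README.

```python
from collections import Counter
from typing import Iterable

MARKERS = ("NEW EXOTEL CONNECTION", "CONVERSATION COMPLETED", "ERROR")


def count_markers(lines: Iterable[str]) -> Counter:
    """Count how many lines contain each known marker."""
    counts = Counter({marker: 0 for marker in MARKERS})
    for line in lines:
        for marker in MARKERS:
            if marker in line:
                counts[marker] += 1
    return counts


def completion_rate(counts: Counter) -> float:
    """Completed conversations divided by new connections (0.0 if none)."""
    connections = counts["NEW EXOTEL CONNECTION"]
    return counts["CONVERSATION COMPLETED"] / connections if connections else 0.0


# Sample lines in an assumed format; real log lines may look different.
sample = [
    "INFO NEW EXOTEL CONNECTION",
    "INFO CONVERSATION COMPLETED",
    "INFO NEW EXOTEL CONNECTION",
    "ERROR connection closed unexpectedly",
]
counts = count_markers(sample)
print(dict(counts), completion_rate(counts))

# To run against the real file:
#   with open("logs/bot.log") as f:
#       counts = count_markers(f)
```

Because the function accepts any iterable of lines, the same code works on an open file handle or on a list of test strings.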
# Monitor key events
grep "NEW EXOTEL CONNECTION" logs/bot.log
grep "CONVERSATION COMPLETED" logs/bot.log
grep "ERROR" logs/bot.log
- Fork the repository
- Create a feature branch
- Make your changes
- Test thoroughly
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Check the troubleshooting section in the README
- Review GitHub Issues for similar problems
- Post detailed issues with logs and configuration
- Exotel for voice streaming services
- OpenAI for Realtime API
- Agent-Stream for inspiration