Online Speech Translator

A real-time speech-to-text and translation tool with a live terminal interface.

Built on top of the excellent KoljaB/RealtimeSTT library.

Features

Real-time speech recognition from microphone (with available input device listing)
Asynchronous translation using deep-translator (Google Translate) or OpenAI
Live updating table in the terminal using rich
Device selection and device listing via command line
Proxy support for translation requests
Language selection for both input and translation
Support for multiple languages (en, ru, de, pt, it, es, fr)
Logging of transcripts and translations (see app.log, transcript.log, transcript_with_translation.log)
Context management for improved translation accuracy
Modular translation and rendering backends (see translator/ and renderer/)
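
The asynchronous translation feature can be illustrated with a minimal sketch. It is not the project's actual code: `fake_translate` is a hypothetical stand-in for a blocking backend call (deep-translator or OpenAI), and `translate_async` and `translate_all` are names invented here. The idea shown is one common approach, running each blocking call in a worker thread so recognition and rendering are not stalled while a translation request is in flight.

```python
import asyncio
from typing import Callable, List

# Hypothetical stand-in for a blocking translation call; a real backend
# would perform a network request here.
def fake_translate(text: str, source: str, target: str) -> str:
    return f"[{source}->{target}] {text}"

async def translate_async(
    translate: Callable[[str, str, str], str],
    text: str,
    source: str,
    target: str,
) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(translate, text, source, target)

async def translate_all(
    sentences: List[str],
    source: str,
    target: str,
    translate: Callable[[str, str, str], str] = fake_translate,
) -> List[str]:
    # Start all translations concurrently; gather returns results in input order.
    tasks = [translate_async(translate, s, source, target) for s in sentences]
    return await asyncio.gather(*tasks)

if __name__ == "__main__":
    for line in asyncio.run(translate_all(["hello", "good morning"], "en", "ru")):
        print(line)
```

Passing the backend in as a plain callable keeps the sketch independent of any particular translation library.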

Project Structure

├── realtime_stt.py                  # Main logic for real-time speech translation
├── input_devices.py                 # Audio device utilities
├── translator/                      # Translation interfaces and implementations
│   ├── __init__.py
│   ├── base.py
│   ├── factory.py
│   ├── google_translator.py
│   └── openai_translator.py
├── compressor/                      # Context compressor for OpenAI
│   ├── __init__.py
│   ├── base.py
│   └── openai_compressor.py
├── renderer/                        # Output rendering (terminal, HTML, etc.)
│   ├── __init__.py
│   ├── base.py
│   ├── factory.py
│   ├── html_fastaip_renderer.py
│   └── rich_render.py
├── requirements.txt                 # Project dependencies
├── readme.md                        # Documentation
├── app.log                          # Application log
├── transcript.log                   # Raw transcript log
├── transcript_with_translation.log  # Transcript with translation log
└── LICENSE                          # License file
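
The base.py / factory.py layout in translator/ and renderer/ suggests an abstract interface plus a factory that picks an implementation by name. The following is a rough sketch of that pattern only. Every name in it (`BaseTranslator`, `EchoTranslator`, `create_translator`) is hypothetical and is not taken from the repository.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type

class BaseTranslator(ABC):
    """Hypothetical interface that every translation backend would implement."""

    def __init__(self, source: str, target: str) -> None:
        self.source = source
        self.target = target

    @abstractmethod
    def translate(self, text: str) -> str:
        ...

class EchoTranslator(BaseTranslator):
    """Toy backend for illustration: returns the text unchanged."""

    def translate(self, text: str) -> str:
        return text

# Registry mapping a name (as it might be passed on the command line) to a class.
_REGISTRY: Dict[str, Type[BaseTranslator]] = {"echo": EchoTranslator}

def create_translator(kind: str, **kwargs) -> BaseTranslator:
    try:
        cls = _REGISTRY[kind]
    except KeyError:
        raise ValueError(f"unknown translator type: {kind!r}") from None
    return cls(**kwargs)

if __name__ == "__main__":
    translator = create_translator("echo", source="en", target="ru")
    print(translator.translate("hello"))
```

With this shape, adding a backend means writing one subclass and registering it; the calling code does not change.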

Quickstart

1. Create a Python Environment

Option A: Using Conda

conda create -n online_speech_translate python=3.12
conda activate online_speech_translate

Option B: Using venv

python3 -m venv .venv
source .venv/bin/activate  # On Windows use: .venv\Scripts\activate

2. Install Dependencies

Linux Installation

Before installing dependencies, run:

sudo apt-get update
sudo apt-get install python3-dev portaudio19-dev

macOS Installation

Before installing dependencies, run:

brew install portaudio

Then, on either platform, clone the repository and install the Python dependencies:

git clone https://github.com/nikkiw/realtime_translator.git
cd realtime_translator
pip install -r requirements.txt

3. List Available Audio Devices

python realtime_stt.py --list_devices
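
For orientation, device listing of this kind can be sketched as follows. This assumes PyAudio is the audio backend (the portaudio system packages above point that way) and is not the actual code of input_devices.py. The helper names `input_devices` and `query_devices` are made up for this example.

```python
from typing import Dict, List, Tuple

def input_devices(devices: List[Dict]) -> List[Tuple[int, str]]:
    """Keep only devices that can record, i.e. have at least one input channel."""
    return [
        (d["index"], d["name"])
        for d in devices
        if d.get("maxInputChannels", 0) > 0
    ]

def query_devices() -> List[Dict]:
    """Read device info dicts from PyAudio (third-party; imported lazily)."""
    import pyaudio

    pa = pyaudio.PyAudio()
    try:
        return [pa.get_device_info_by_index(i) for i in range(pa.get_device_count())]
    finally:
        pa.terminate()  # release PortAudio resources
```

Calling `input_devices(query_devices())` would return (index, name) pairs; the index is the kind of value passed to --input_device_index.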

4. Run the Application

To run the real-time speech translator, execute:

python realtime_stt.py --input_device_index <device_index> --input_lang <source_language> --translate_lang <target_language>

Command Line Arguments

--input_device_index: Index of the audio input device (default is 0).
--input_lang: Language spoken by the speaker, as a 2-letter code (e.g., 'en'). Supported values: en, ru, de, pt, it, es, fr.
--translate_lang: Target translation language, as a 2-letter code (e.g., 'ru'). Supported values: en, ru, de, pt, it, es, fr.
--proxy: Optional proxy URL for translation requests.
--translator_type: Type of translator to use ('google' or 'openai').
--openai_api_key: OpenAI API key for using the OpenAI translator.
--renderer: Output rendering engine. Options:
- rich: Shows a live-updating, colorized table in the terminal (recommended for CLI use).
- html_fastaip: Outputs results to an HTML page at http://127.0.0.1:8090 (convenient for web integration or browser viewing).
--log_level: Logging level (default: INFO).
--list_devices: List available audio devices and exit.
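
The flags above map naturally onto argparse. The sketch below mirrors the documented options but is not the parser from realtime_stt.py. Only the defaults for --input_device_index (0) and --log_level (INFO) come from this README; every other default is an assumption and is marked as such in the comments.

```python
import argparse

LANGS = ["en", "ru", "de", "pt", "it", "es", "fr"]

def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(description="Real-time speech translator (sketch)")
    p.add_argument("--input_device_index", type=int, default=0)
    p.add_argument("--input_lang", choices=LANGS, default="en")      # default assumed
    p.add_argument("--translate_lang", choices=LANGS, default="ru")  # default assumed
    p.add_argument("--proxy", default=None)
    p.add_argument("--translator_type", choices=["google", "openai"],
                   default="google")                                 # default assumed
    p.add_argument("--openai_api_key", default=None)
    p.add_argument("--renderer", choices=["rich", "html_fastaip"],
                   default="rich")                                   # default assumed
    p.add_argument("--log_level", default="INFO")
    p.add_argument("--list_devices", action="store_true")
    return p

if __name__ == "__main__":
    # Parse an explicit argument list so the example is reproducible.
    args = build_parser().parse_args(
        ["--input_lang", "en", "--translate_lang", "ru", "--renderer", "html_fastaip"]
    )
    print(args)
```

Using `choices` makes argparse reject unsupported language codes or renderer names before any audio or network work starts.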

Example with Proxy and HTML Renderer

python realtime_stt.py --input_device_index 0 --input_lang en --translate_lang ru --renderer html_fastaip --proxy http://user:pass@host:port

Renderer Options

rich: Displays results in a modern, colorized table directly in your terminal. Supports live updates and is ideal for command-line workflows.
html_fastaip: Renders results as an HTML page (http://127.0.0.1:8090) for viewing in a browser or embedding in web apps. Useful for sharing or integrating with other tools.

Requirements

Python 3.12
See requirements.txt for Python dependencies

Logs

app.log: General application log
transcript.log: Raw recognized text
transcript_with_translation.log: Recognized text with translation
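
One way to keep the three logs separate is a dedicated logger per file, as in the sketch below. It uses only the standard logging module. The logger names, the log format, and the helper functions `make_file_logger` and `setup_logs` are assumptions for illustration, not the project's actual setup.

```python
import logging
from pathlib import Path
from typing import Dict, Union

def make_file_logger(name: str, path: Path, level: int = logging.INFO) -> logging.Logger:
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep transcript lines out of the root logger
    if not logger.handlers:   # avoid adding duplicate handlers on repeated calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger

def setup_logs(log_dir: Union[str, Path] = ".") -> Dict[str, logging.Logger]:
    log_dir = Path(log_dir)
    return {
        "app": make_file_logger("app", log_dir / "app.log"),
        "transcript": make_file_logger("transcript", log_dir / "transcript.log"),
        "translation": make_file_logger(
            "transcript_with_translation",
            log_dir / "transcript_with_translation.log",
        ),
    }
```

After `logs = setup_logs()`, a call such as `logs["transcript"].info("recognized text")` appends one timestamped line to transcript.log.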

Contributing

Contributions are welcome! Please submit a pull request or open an issue for any enhancements or bug fixes.

License

This project is licensed under the MIT License. See the LICENSE file for more details.
