I built this project because there was no user friendly way to upload a file to a dockerized flask web form and have whisper do its thing via CLI in the background. Now there is. Enjoy!

jlonge4/whisperAI-flask-docker

🎤 Whisper AI Transcription App

A modern, user-friendly Streamlit web interface for OpenAI's Whisper speech-to-text model, containerized with Docker for easy deployment.

✨ Features

  • 🖱️ Drag & Drop Interface: Simply drag audio files into the browser or click to browse
  • 🧠 Model Selection: Choose from 5 different Whisper models based on your accuracy/speed needs:
    • tiny.en - Fastest, least accurate (~39M parameters)
    • base.en - Good balance (~74M parameters)
    • small.en - Better accuracy (~244M parameters)
    • medium.en - High accuracy (~769M parameters)
    • large - Best accuracy (~1550M parameters)
  • 📁 Flexible Output: Save files to the default location or specify a custom path
  • 📥 Download Buttons: Download transcript files (TXT, SRT, VTT) directly from the web interface
  • 🐳 Docker Volume Support: Map container directories to host machine for persistent file access
  • 💾 Memory Optimized: CPU-only processing with memory-friendly options
  • 📱 Responsive UI: Clean, modern interface that works on desktop and mobile
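
As a rough illustration of how model selection and CPU-only processing could translate into a Whisper CLI call, here is a minimal sketch. The helper name `build_whisper_command` and the exact set of flags are assumptions, not the app's actual code. The flags themselves (`--model`, `--output_dir`, `--device`, `--fp16`) are standard options of the `whisper` command-line tool.

```python
from pathlib import Path

# Model names copied from the feature list above.
MODELS = ("tiny.en", "base.en", "small.en", "medium.en", "large")


def build_whisper_command(audio_path, model="base.en", output_dir="./whisper_files"):
    """Return the argv list for one CPU-only Whisper CLI run (hypothetical helper)."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    return [
        "whisper", str(Path(audio_path)),
        "--model", model,
        "--output_dir", str(output_dir),
        "--device", "cpu",   # CPU-only processing
        "--fp16", "False",   # half precision is not supported on CPU
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so no model download is triggered.
    print(" ".join(build_whisper_command("meeting.mp3", model="tiny.en")))
```

Passing the resulting list to `subprocess.run` would start the transcription. Building the command as a list rather than a single string avoids shell-quoting problems with file names that contain spaces.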

🚀 Quick Start

Build the Docker Image

git clone https://github.com/jlonge4/whisperAI-flask-docker.git
cd whisperAI-flask-docker/whisperapp
docker build -t whisper-streamlit .

Run the Container

Basic usage (download files via web interface):

docker run -p 8501:8501 whisper-streamlit

With volume mapping (optional - for direct file access):

docker run -p 8501:8501 -v ~/transcripts:./whisper_files whisper-streamlit

With extra memory (for larger models):

docker run --memory=4g -p 8501:8501 -v ~/transcripts:./whisper_files whisper-streamlit

Access the App

Open your browser and go to: http://localhost:8501

🎯 How to Use

  1. Upload Audio: Drag and drop your audio file or click to browse
    • Supported formats: MP3, WAV, MP4, M4A, FLAC, OGG, WMA
  2. Select Model: Choose the Whisper model that fits your needs
  3. Choose Output Location: Use the default ./whisper_files or specify a custom path
  4. Start Transcription: Click the transcription button and wait for processing
  5. Get Results: Download files instantly using the download buttons (or access via volume mapping if configured)
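
Steps 1 and 3 above can be sketched as two small checks: one that accepts only the listed audio formats, and one that falls back to the default output directory when no custom path is given. This is an illustrative sketch only. The function names are hypothetical and the app may validate uploads differently.

```python
from pathlib import Path

# Extensions for the supported formats listed in step 1.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".mp4", ".m4a", ".flac", ".ogg", ".wma"}
DEFAULT_OUTPUT_DIR = Path("./whisper_files")


def is_supported_audio(filename):
    """Check the file extension case-insensitively against the supported list."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def resolve_output_dir(custom_path=""):
    """Use the custom path if one was entered, otherwise the default directory."""
    custom_path = custom_path.strip()
    return Path(custom_path) if custom_path else DEFAULT_OUTPUT_DIR


if __name__ == "__main__":
    print(is_supported_audio("Interview.MP3"))
    print(resolve_output_dir("  "))
```

Checking only the extension is a convenience filter, not a guarantee that the file decodes. A file that passes this check can still fail during transcription if its contents are corrupt.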

📂 Output Files

The app generates three transcript formats:

  • .txt - Plain text transcript
  • .srt - SubRip subtitle format with timestamps
  • .vtt - WebVTT subtitle format
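
The two subtitle formats differ mainly in their timestamp syntax: SRT writes milliseconds after a comma and numbers each cue, while WebVTT uses a period and starts the file with a `WEBVTT` header line. The sketch below shows that difference with hypothetical helper functions. It is not Whisper's own writer code.

```python
def format_timestamp(seconds, decimal_marker):
    """Format a time in seconds as HH:MM:SS<marker>mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_marker}{ms:03d}"


def srt_cue(index, start, end, text):
    """One SRT cue: a sequence number, comma-style timestamps, then the text."""
    return f"{index}\n{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}\n{text}\n"


def vtt_cue(start, end, text):
    """One WebVTT cue: period-style timestamps, then the text."""
    return f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}\n{text}\n"


if __name__ == "__main__":
    print(srt_cue(1, 0.0, 2.5, "Hello and welcome."))
    print("WEBVTT\n")
    print(vtt_cue(0.0, 2.5, "Hello and welcome."))
```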

🐳 Docker Volume Mapping (Optional)

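Volume mapping is optional because the built-in download buttons already deliver the files. If you want direct file system access, mount a host directory over the container's output directory. Docker requires the container side of -v to be an absolute path, so if ./whisper_files is rejected, use that directory's absolute path inside the container instead: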

# Map container's ./whisper_files to host's ~/transcripts
docker run -p 8501:8501 -v ~/transcripts:./whisper_files whisper-streamlit

# Map to any host directory
docker run -p 8501:8501 -v /path/to/your/folder:./whisper_files whisper-streamlit
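
If you script the container launch, a small helper can assemble the `docker run` arguments and catch a relative container path before Docker does. This is a sketch under stated assumptions: `build_docker_run` is a hypothetical helper, and `/app/whisper_files` in the usage line is a placeholder, since the image's real working directory is not stated in this README.

```python
from pathlib import Path, PurePosixPath


def build_docker_run(host_dir, container_dir, image="whisper-streamlit",
                     port=8501, memory=None):
    """Return the argv list for `docker run` with one bind mount (hypothetical helper)."""
    # Expand ~ and make the host path absolute.
    host = Path(host_dir).expanduser().resolve()
    container = PurePosixPath(container_dir)
    if not container.is_absolute():
        raise ValueError(f"container path must be absolute: {container_dir!r}")
    cmd = ["docker", "run"]
    if memory:
        cmd.append(f"--memory={memory}")  # e.g. "4g" for the larger models
    cmd += ["-p", f"{port}:{port}", "-v", f"{host}:{container}", image]
    return cmd


if __name__ == "__main__":
    # "/app/whisper_files" is an assumed container path, not taken from the image.
    print(" ".join(build_docker_run("~/transcripts", "/app/whisper_files", memory="4g")))
```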

💡 Tips

  • Start with smaller models (tiny.en or base.en) to test functionality
  • Use larger models (medium.en or large) for better accuracy on complex audio
  • Increase Docker memory if you get out-of-memory errors with larger models
  • Download buttons provide instant file access without any setup required
  • Volume mapping is optional - only needed if you want files to persist on your host machine

🔧 System Requirements

  • Docker Desktop (with sufficient memory allocation)
  • At least 2GB RAM (4GB+ recommended for larger models)
  • ARM64 or AMD64 architecture support

🆕 What's New (Streamlit Version)

This project has been upgraded from Flask to Streamlit with major improvements:

  • Modern drag-and-drop interface
  • Real-time model selection
  • Built-in download functionality
  • Better error handling and user feedback
  • Memory optimization for Docker containers
  • Multi-platform Docker support (ARM64/AMD64)

Original concept: A user-friendly way to upload files to a dockerized Flask web form and have Whisper transcribe them. Now with a modern Streamlit interface! 🎉
