GitHub - joshooaj/MuxMinus: Break down any song into individual stems for vocals, drums, bass, guitar, piano, and more. Powered by Demucs from Facebook Research.

🎵 Separate Music Stems & Transcribe Speech with AI

Break down any song into individual stems for vocals, drums, bass, guitar, piano, and more.
Convert audio and video to text with AI-powered transcription, subtitles, and lyrics generation.
Powered by Demucs and Whisper from Facebook Research and OpenAI.

Features • How It Works • AI Models • Transcription • Screenshots • Tech Stack

✨ Features


🎯 High Quality	Powered by Demucs and Whisper AI from Facebook Research and OpenAI
⚡ Fast Processing	No need to install anything — just upload your files
🎨 Multiple Options	Stem separation (2, 4, or 6 stems) and speech transcription
📝 Transcription	Convert speech to text, generate subtitles, or extract lyrics
🔒 Privacy First	Files automatically deleted after 24 hours
💰 Pay As You Go	No subscriptions — start with 3 free credits
🌐 API Access	Integrate processing into your workflow (coming soon)

🚀 How It Works

1️⃣ Upload

Upload any audio or video file — MP3, WAV, FLAC, MP4, and more. We support all common formats.

2️⃣ Choose

Select the AI model and processing type: stem separation, transcription, subtitles, or lyrics.

3️⃣ Download

Get your separated stems in MP3 or high-quality WAV format.

🤖 AI Models

Multiple spectrogram/waveform separation models are available through Demucs for different tasks.

4-Stem Separation

Separate your music into:

🎤 Vocals
🥁 Drums
🎸 Bass
🎹 Other instruments

Models: htdemucs, htdemucs_ft

6-Stem Separation

Get even more control with:

🎤 Vocals
🥁 Drums
🎸 Bass
🎸 Guitar
🎹 Piano
🎵 Other

Model: htdemucs_6s

2-Stem Separation

Quick isolation of a single element:

🎤 Vocals + Everything Else
🥁 Drums + Everything Else
🎸 Bass + Everything Else

Models: htdemucs, htdemucs_ft, htdemucs_6s

🎙️ Transcription

Powered by OpenAI's Whisper model, Mux Minus now supports speech-to-text transcription with multiple output formats.

Basic Transcription (1 credit)

Convert speech from audio or video files to plain text:

📝 Automatic language detection
🌍 Supports 99+ languages
📄 Output: Plain text (.txt)

Timestamped Transcription (1 credit)

Get transcription with precise timestamps:

⏱️ Segment-level timestamps
📊 JSON format with metadata
🎬 Perfect for video chapters

Subtitle Generation (1 credit)

Generate subtitle files for videos:

📺 SRT format (SubRip)
🌐 WebVTT format
✅ Ready for video players

Lyrics from Music (2 credits)

Extract timestamped lyrics from songs:

🎤 Two-step pipeline: vocals isolation + transcription
🎵 LRC format with timestamps
🎼 Better accuracy than transcribing full mix
📀 Includes isolated vocals audio file

Supported Formats:

Audio: MP3, WAV, FLAC, OGG, M4A, AAC
Video: MP4, MKV, AVI, MOV, WebM
File size limit: 5GB

📸 Screenshots

Landing Page

Interactive Demo

Job Creation

Results & Playback

🔬 About Demucs

Mux Minus is built on top of Demucs, an open-source audio source separation model created by Facebook AI Research (FAIR).

Demucs uses a hybrid approach combining waveform and spectrogram processing with deep learning to achieve state-of-the-art results in music source separation.

Run Demucs Yourself

If you'd prefer to run Demucs locally on your own computer, you can! Here's how:

Installation

# Install with pip
pip install demucs

# Or with conda
conda install -c conda-forge demucs

Basic Usage

# Separate a song into 4 stems (vocals, drums, bass, other)
demucs your-song.mp3

# Use the 6-stem model (adds guitar and piano)
demucs --two-stems=vocals your-song.mp3  # Just vocals + accompaniment
demucs -n htdemucs_6s your-song.mp3      # Full 6-stem separation

Output

By default, Demucs creates a separated folder with subfolders for each model and track:

separated/
└── htdemucs/
    └── your-song/
        ├── vocals.wav
        ├── drums.wav
        ├── bass.wav
        └── other.wav

For full documentation, visit the Demucs GitHub repository.

🛠️ Tech Stack

Mux Minus is built with modern, production-ready technologies:

Frontend

Technology	Purpose
Django	Web framework & templating
Vanilla JS	Interactive components
WaveSurfer.js	Waveform visualization & audio playback
CSS3	Modern styling with CSS variables

Backend

Technology	Purpose
Django	REST API, user management, administration
FastAPI	Internal backend service for job processing
Demucs	AI-powered audio separation
PostgreSQL	Production database
SQLite	Development database

Infrastructure

Technology	Purpose
Docker	Containerization
Docker Compose	Multi-container orchestration
WhiteNoise	Static file serving
Traefik	Reverse proxy (production)

Payments

Technology	Purpose
Square	Payment processing

📄 License

This project is open source. See the LICENSE file for details.

_{Built with ❤️ using Demucs
and Copilot (Claude Opus 4.5)}

Name		Name	Last commit message	Last commit date
Latest commit History 92 Commits
.github		.github
.vscode		.vscode
app		app
backend		backend
.env.example		.env.example
.gitignore		.gitignore
.markdownlint.json		.markdownlint.json
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CREATE_JOB_UPDATE.md		CREATE_JOB_UPDATE.md
LICENSE		LICENSE
README.md		README.md
TRANSCRIPTION_FEATURE.md		TRANSCRIPTION_FEATURE.md
docker-compose.yml		docker-compose.yml

License

joshooaj/MuxMinus

Folders and files

Latest commit

History

Repository files navigation