guynich/demultiplex_music

demultiplex music

The demucs ML model separates vocals, drums, bass and other tracks from music.

The source repo is released under the permissive MIT licence.

Installing demucs

Tested on MacBook Pro 2020 (Intel x86) with macOS 15.5.

Python version

I used pyenv to install a suitable Python version (>= 3.8) on macOS. The model uses PyTorch, which does not support Python 3.13 as of June 14, 2025.

python3 --version
Python 3.12.3
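
The version constraint above can be checked from Python before creating the environment. This is a minimal sketch: the bounds come straight from the text (at least 3.8, below 3.13), and the function name `python_version_ok` is made up for illustration.

```python
import sys

# Bounds taken from the text above: Python >= 3.8, and below 3.13
# because of the PyTorch limitation noted as of June 14, 2025.
MIN_VERSION = (3, 8)
FIRST_UNSUPPORTED = (3, 13)


def python_version_ok(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < FIRST_UNSUPPORTED


if __name__ == "__main__":
    status = "ok" if python_version_ok() else "unsupported"
    print(sys.version.split()[0], status)
```

The upper bound reflects the situation on the date given above, so it may need raising later.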

Virtual environment

Create a virtual environment. I used Python's venv for this.

cd
python3 -m venv venv_demucs
source ./venv_demucs/bin/activate

python3 -m pip install --upgrade pip
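
To confirm that the activated interpreter really is the one inside venv_demucs, the prefix comparison below can help. It is a small sketch using the standard `sys.prefix` and `sys.base_prefix` attributes; the helper name `in_virtualenv` is hypothetical.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """True when the interpreter prefix differs from the base install prefix.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the Python installation it was made from.
    """
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix
    return prefix != base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```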

Install the demucs PyPI package in the virtual environment.

python3 -m pip install -U demucs

On macOS I hit a numpy version error triggered in /torch/nn/modules/transformer.py. The error did not reproduce on Ubuntu 22.04.5 with numpy version 2.2.26.

demucs
A module that was compiled using NumPy 1.x cannot be run in
NumPy 2.3.0 as it may crash. To support both 1.x and 2.x
versions of NumPy, modules must be compiled with NumPy 2.0.
Some module may need to rebuild instead e.g. with 'pybind11>=2.12'.

If you are a user of the module, the easiest solution will be to
downgrade to 'numpy<2' or try to upgrade the affected module.
We expect that some modules will need time to support NumPy 2.

On macOS I downgraded numpy in the virtual environment to version 1.26.4.

python3 -m pip install "numpy<2"
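
Whether the downgrade is needed can be decided from the installed numpy version string. The sketch below assumes the rule implied by the error message (a NumPy 2.x install with modules built against 1.x is the problem case); `needs_downgrade` is an illustrative name, not part of demucs or numpy.

```python
from importlib import metadata


def numpy_major(version):
    """Extract the major version number from a string such as '1.26.4'."""
    return int(version.split(".")[0])


def needs_downgrade(version):
    """True for NumPy 2.x, the case that triggered the error on macOS."""
    return numpy_major(version) >= 2


if __name__ == "__main__":
    try:
        installed = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        print("numpy is not installed in this environment")
    else:
        print(installed, "downgrade needed:", needs_downgrade(installed))
```

This only reports the version. As noted above, NumPy 2.x worked on Ubuntu, so a True result is a hint and not proof of a problem.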

Runs

Run demucs from the command line inside the virtual environment.

cd
source ./venv_demucs/bin/activate

demucs <path_to_music_file>
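
For scripted runs, the same command can be launched from Python. This is a sketch only: it shells out to the `demucs` executable exactly as typed above, and the helper names `build_demucs_command` and `run_demucs` are invented for this example. The optional `two_stems` argument maps to the `--two-stems` option used in Run2.

```python
import subprocess


def build_demucs_command(audio_path, two_stems=None):
    """Build the argument list for the demucs command line tool."""
    cmd = ["demucs", str(audio_path)]
    if two_stems is not None:
        # Same form as in Run2: --two-stems=vocals
        cmd.append(f"--two-stems={two_stems}")
    return cmd


def run_demucs(audio_path, two_stems=None):
    """Run demucs and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(build_demucs_command(audio_path, two_stems), check=True)


if __name__ == "__main__":
    print(build_demucs_command("Downloads/fast_car_48k.wav"))
```

Call `run_demucs(...)` from an interpreter started inside the activated virtual environment so that the `demucs` executable is found on the PATH.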

Run1

I tested on a single-track mono file with a 48 kHz sampling rate, a random file from my downloads folder. The run took 4m13s.

(venv_demucs) $ demucs Downloads/fast_car_48k.wav
Important: the default model was recently changed to `htdemucs` the latest Hybrid Transformer Demucs model. In some cases, this model can actually perform worse than previous models. To get back the old default model use `-n mdx_extra_q`.
Downloading: "https://dl.fbaipublicfiles.com/demucs/hybrid_transformer/955717e8-8726e21a.th" to /Users/guynicholson/.cache/torch/hub/checkpoints/955717e8-8726e21a.th
100%|████████████████████████████████████████████████████████████████████████████████| 80.2M/80.2M [00:01<00:00, 62.8MB/s]
Selected model is a bag of 1 models. You will see that many progress bars per track.
Separated tracks will be stored in /Users/guynicholson/separated/htdemucs
Separating track Downloads/fast_car_48k.wav
100%|██████████████████████████████████████████████| 298.34999999999997/298.34999999999997 [04:13<00:00,  1.18seconds/s]
(venv_demucs) $

An 80MB model file was downloaded to the cache in the above step.

After the run I found four generated WAV files in the separated/htdemucs sub-folder. Each file has a 44.1 kHz sampling rate.

[Image: Separated files]
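
The output sampling rate can be read from the WAV header without extra packages. A minimal sketch, assuming the stems are integer PCM WAV files that Python's standard `wave` module can open; the helper name `wav_sample_rate` is made up, and the path in the usage line is a placeholder.

```python
import wave


def wav_sample_rate(path):
    """Read the sample rate in Hz from a PCM WAV file header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


if __name__ == "__main__":
    import sys

    # Usage: python3 check_rate.py path/to/stem.wav
    for wav_path in sys.argv[1:]:
        print(wav_path, wav_sample_rate(wav_path), "Hz")
```

The `wave` module raises `wave.Error` for formats it does not support, so a failure here does not necessarily mean the file is broken.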

I was impressed on my first listen.

  1. All tracks have the expected separation. There are some quiet artifacts such as modulation and bleed-through from other tracks. Reverberation timbre can sound modulated.
  2. The vocals track preserves the singing timbre and artist identity.

Run2 --two-stems=vocals

The --two-stems=vocals option separates the vocals from the rest of the accompaniment (i.e., "karaoke" mode). vocals can be replaced with any source in the selected model. Before running this I renamed the sub-folder generated in run1 so it would not be overwritten.

cd
source ./venv_demucs/bin/activate

demucs Downloads/fast_car_48k.wav --two-stems=vocals

The run took 3m49s.

(venv_demucs) $ demucs Downloads/fast_car_48k.wav --two-stems=vocals
Important: the default model was recently changed to `htdemucs` the latest Hybrid Transformer Demucs model. In some cases, this model can actually perform worse than previous models. To get back the old default model use `-n mdx_extra_q`.
Selected model is a bag of 1 models. You will see that many progress bars per track.
Separated tracks will be stored in /Users/guynicholson/separated/htdemucs
Separating track Downloads/fast_car_48k.wav
100%|██████████████████████████████████████████████| 298.34999999999997/298.34999999999997 [03:49<00:00,  1.30seconds/s]
(venv_demucs) $

This generated two WAV files.

[Image: Separated karaoke files]

I was impressed on my first listen.

  1. All tracks have the expected separation.
  2. The no_vocals track has a very quiet slightly ghostly sounding vocal.
  3. The vocals track sounded the same as in run1.
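
The difference in outputs between the two runs can be summarised in a few lines. In this sketch the four stem names come from the model description at the top, and `no_vocals` is the name observed in run2; extending the `no_<stem>` pattern to the other stems is an assumption, and `expected_stems` is a hypothetical helper.

```python
# Sources named in the introduction of this README.
FOUR_STEMS = ["vocals", "drums", "bass", "other"]


def expected_stems(two_stems=None):
    """Return the stem names a run is expected to write.

    Without two_stems, all four sources are written (run1).
    With two_stems, the chosen stem plus its complement is written (run2).
    The "no_<stem>" naming for stems other than vocals is assumed.
    """
    if two_stems is None:
        return list(FOUR_STEMS)
    if two_stems not in FOUR_STEMS:
        raise ValueError(f"unknown stem: {two_stems!r}")
    return [two_stems, f"no_{two_stems}"]


if __name__ == "__main__":
    print(expected_stems())          # run1
    print(expected_stems("vocals"))  # run2
```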

System load

macOS CPU

I ran these on a MacBook Pro 2020 (Intel) with a quad-core i5 and 16 GB RAM. The python3.12 process used all four CPU cores, peaking at ~380%, and memory usage stayed between 1 and 1.3 GB.

Ubuntu GPU

Running the same separation task on a workstation GPU (NVIDIA RTX A2000, Ampere) took just 13 seconds, more than 20x real-time. GPU memory usage for the Python 3.10 process was ~900 MB.
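
The real-time figures can be reproduced from the numbers already reported: the progress bars show a track length of about 298.35 seconds, and the wall-clock times are 4m13s, 3m49s and 13 seconds. A short sketch of the arithmetic (the function name is illustrative):

```python
AUDIO_SECONDS = 298.35  # track length shown in the demucs progress bars


def realtime_factor(audio_seconds, wall_seconds):
    """Seconds of audio processed per second of wall-clock time."""
    return audio_seconds / wall_seconds


# Wall-clock times reported for each run, in seconds.
runs = {
    "run1 (macOS CPU)": 4 * 60 + 13,
    "run2 (macOS CPU)": 3 * 60 + 49,
    "Ubuntu GPU": 13,
}

for name, wall in runs.items():
    print(f"{name}: {realtime_factor(AUDIO_SECONDS, wall):.2f}x real-time")
# run1 (macOS CPU): 1.18x real-time
# run2 (macOS CPU): 1.30x real-time
# Ubuntu GPU: 22.95x real-time
```

The two CPU values match the rates printed by the progress bars (1.18 and 1.30 seconds/s), and the GPU value is consistent with the ">20x" claim.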

Next steps

  • Try demucs on Ubuntu 22.04.5 LTS. It generates the same four WAV files.
  • Try htdemucs_ft model.
  • Dataset labelling using Python scripting on GPU.
  • Can the model support streaming audio in chunks?

About

No code repo with installation steps.
