This is general info. Click here for the complete wiki and here for a more generic intro to audio data handling.
- [2022-01-01] If you are not interested in training audio models on your own data, you can check the Deep Audio API, where you can directly send audio data and receive predictions regarding the respective audio content (speech vs silence, musical genre, speaker gender, etc.).
- [2021-08-06] deep-audio-features: deep audio classification and feature extraction using CNNs and PyTorch
- Check out paura, a Python script for real-time recording and analysis of audio data
pyAudioAnalysis is a Python library covering a wide range of audio analysis tasks. Through pyAudioAnalysis you can:
- Extract audio features and representations (e.g. MFCCs, spectrogram, chromagram); see the sketch after this list
- Train, parameter-tune, and evaluate classifiers of audio segments
- Classify unknown sounds
- Detect audio events and exclude silence periods from long recordings
- Perform supervised segmentation (joint segmentation - classification)
- Perform unsupervised segmentation (e.g. speaker diarization) and extract audio thumbnails
- Train and use audio regression models (example application: emotion recognition)
- Apply dimensionality reduction to visualize audio data and content similarities
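For a first taste of the underlying API, here is a minimal sketch covering two of the tasks above, short-term feature extraction and silence removal. It assumes the snake_case module layout of recent releases and reuses the data/doremi.wav sample referenced elsewhere in this README:

```python
from pyAudioAnalysis import audioBasicIO, ShortTermFeatures
from pyAudioAnalysis import audioSegmentation as aS

# Load a WAV file: returns the sampling rate and the signal
fs, signal = audioBasicIO.read_audio_file("data/doremi.wav")
signal = audioBasicIO.stereo_to_mono(signal)  # down-mix to mono if needed

# Short-term feature extraction: 50 ms frames with a 25 ms step
features, feature_names = ShortTermFeatures.feature_extraction(
    signal, fs, 0.050 * fs, 0.025 * fs)
print(features.shape)  # (num_features, num_frames): MFCCs, chroma, etc.

# Silence removal: returns [start_sec, end_sec] intervals of non-silent audio
segments = aS.silence_removal(signal, fs, 0.020, 0.020,
                              smooth_window=1.0, weight=0.3)
```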
More examples and detailed tutorials can be found at the wiki.
pyAudioAnalysis provides easy-to-call wrappers for executing audio analysis tasks. E.g., the following code first trains an audio segment classifier, given a set of WAV files stored in folders (each folder representing a different class), and then uses the trained classifier to classify an unknown WAV file:
```python
from pyAudioAnalysis import audioTrainTest as aT
aT.extract_features_and_train(["classifierData/music","classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)
aT.file_classification("data/doremi.wav", "svmSMtemp","svm")
```

Result: `(0.0, array([ 0.90156761, 0.09843239]), ['music', 'speech'])`
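The returned tuple can be unpacked directly; a small sketch based on the result format shown above:

```python
from pyAudioAnalysis import audioTrainTest as aT

# Unpack the classification result (tuple structure as in the output above)
class_id, probabilities, class_names = aT.file_classification(
    "data/doremi.wav", "svmSMtemp", "svm")
print(class_names[int(class_id)])  # -> "music" for the example result above
print(probabilities)               # per-class probabilities, same order as class_names
```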
In addition, command-line support is provided for all functionalities. E.g., the following command extracts the spectrogram of an audio signal stored in a WAV file:

```bash
python audioAnalysis.py fileSpectrogram -i data/doremi.wav
```
Apart from this README file, to better understand how to use this library one should read the following:
- Audio Handling Basics: Process Audio Files In Command-Line or Python, if you want to learn how to handle audio files from the command line, along with some basic audio signal processing in Python. Start with that if you don't know anything about audio.
- Intro to Audio Analysis: Recognizing Sounds Using Machine Learning. This goes a bit deeper than the previous article, providing a complete intro to the theory and practice of audio feature extraction, classification and segmentation (includes many Python examples).
- The library's wiki
- How to Use Machine Learning to Color Your Lighting Based on Music Mood. An interesting use case of using this library to train a real-time music mood estimator.
- A more general and theoretical description of the adopted methods (along with several experiments on particular use cases) is presented in this publication. Please use the following citation when citing pyAudioAnalysis in your research work:
```bibtex
@article{giannakopoulos2015pyaudioanalysis,
  title={pyAudioAnalysis: An Open-Source Python Library for Audio Signal Analysis},
  author={Giannakopoulos, Theodoros},
  journal={PloS one},
  volume={10},
  number={12},
  year={2015},
  publisher={Public Library of Science}
}
```

For MATLAB-related audio analysis material, check this book.
Theodoros Giannakopoulos, Principal Researcher of Multimodal Machine Learning at the Multimedia Analysis Group of the Computational Intelligence Lab (MagCIL) of the Institute of Informatics and Telecommunications of the National Center for Scientific Research "Demokritos".
- Clone the repository:

```bash
git clone https://github.com/tyiannak/pyAudioAnalysis.git
cd pyAudioAnalysis
```

- Create and activate a virtual environment (recommended):

```bash
python3 -m venv venv
# On Windows:
venv\Scripts\activate
# On Unix or macOS:
source venv/bin/activate
```

- Install dependencies and the package:

```bash
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install -e .
```

pyAudioAnalysis now includes a web-based UI for easy access to all functionality:
- Ensure you're in your virtual environment:

```bash
# On Windows:
venv\Scripts\activate
# On Unix or macOS:
source venv/bin/activate
```

- Launch the UI:

```bash
PYTHONPATH=$(pwd) streamlit run pyAudioAnalysis/audioUI.py
```

- Your default web browser will automatically open the UI (typically at http://localhost:8501)
The interface provides easy access to:
- Audio Classification: Upload and classify audio files using pre-trained models
- Feature Extraction: Visualize audio features like MFCCs, spectrograms
- Beat Extraction: Analyze rhythm and tempo
- Segmentation: Perform audio segmentation tasks
- Regression: Train and use regression models
- Audio files must be in WAV format
- For MP3 files, convert to WAV first using FFmpeg (or programmatically, as sketched below):

```bash
ffmpeg -i input.mp3 output.wav
```

- Models should be trained first using the command-line interface before using them in the UI
- Large audio files may take longer to process; consider splitting them into smaller segments
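If you prefer to do the conversion from Python, pydub wraps the same FFmpeg call; a minimal sketch (file names are placeholders) that also shows one way to split long recordings:

```python
from pydub import AudioSegment

# Decode the MP3 (requires FFmpeg on PATH) and write a WAV for pyAudioAnalysis
song = AudioSegment.from_mp3("input.mp3")
song.export("output.wav", format="wav")

# Optional: split a long recording into 60-second chunks before analysis
for i, start_ms in enumerate(range(0, len(song), 60_000)):
    song[start_ms:start_ms + 60_000].export(f"chunk_{i:03d}.wav", format="wav")
```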
```bash
# Run all tests
python -m pytest tests/

# Run tests with coverage report
python -m pytest --cov=pyAudioAnalysis tests/

# Run specific test file
python -m pytest tests/test_standard.py

# Run tests verbosely
python -m pytest -v tests/

# Run tests matching specific pattern
python -m pytest -k "test_feature" tests/
```

Test files are organized by functionality:
- test_standard.py: Core functionality tests (previously in shell scripts)
- test_audio_utils.py: Utility function tests
- test_ui.py: Streamlit UI tests
```bash
# Generate HTML coverage report
python -m pytest --cov=pyAudioAnalysis --cov-report=html tests/

# Generate both coverage and test reports
python -m pytest tests/ --cov=pyAudioAnalysis --cov-report=html --html=tests/test-report.html
```

The HTML coverage report will be available in the htmlcov directory, and the test report will be in tests/test-report.html.
The project uses GitHub Actions for continuous integration, running:
- All tests with coverage reporting
- Code style checks
- System dependency verification
- Multiple Python version testing
Test reports and coverage information are automatically uploaded as artifacts and to Codecov.
If you encounter any issues:
- Ensure all dependencies are installed:

```bash
pip install -r requirements.txt
```

- Verify system dependencies (see the Python check after this list):

```bash
# Ubuntu/Debian
sudo apt-get install ffmpeg libavcodec-extra

# macOS
brew install ffmpeg
```

- Check that you're in the correct directory and that the virtual environment is activated
- If you get audio-related errors, ensure your system's audio drivers are properly configured
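A quick way to verify the FFmpeg dependency from Python (a standalone check, not part of pyAudioAnalysis):

```python
import shutil

# Fail early if FFmpeg is missing from PATH (needed for MP3 decoding)
if shutil.which("ffmpeg") is None:
    raise SystemExit("ffmpeg not found; install it, e.g. 'sudo apt-get install ffmpeg'")
print("ffmpeg found at:", shutil.which("ffmpeg"))
```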
For those who prefer to skip the UI, all features are still available programmatically or through the CLI:
```python
from pyAudioAnalysis import audioTrainTest as aT

# Train a classifier
aT.extract_features_and_train(["classifierData/music","classifierData/speech"], 1.0, 1.0, aT.shortTermWindow, aT.shortTermStep, "svm", "svmSMtemp", False)

# Classify a file
aT.file_classification("data/doremi.wav", "svmSMtemp","svm")
```

The UI looks for pre-trained models in the data/models directory. To use the classification features in the UI:
- Create a models directory structure:

```
pyAudioAnalysis/
├── data/
│   └── models/
│       ├── svm_rbf_sm.svm              # Speech/Music classifier
│       ├── svm_rbf_4class.svm          # 4-class audio classifier
│       ├── knn_speaker_10.knn          # 10-speaker recognition
│       ├── svm_rbf_movie8.svm          # Movie genre classifier
│       └── svm_rbf_musical_genre_6.svm # Music genre classifier
```

- Model naming convention:
  - Format: `{classifier_type}_{task_name}.{ext}`
  - Classifier types: `svm_rbf` or `knn`
  - Extensions: `.svm` for SVM models, `.knn` for KNN models
- Default model search paths:
  - `./data/models`
  - `../data/models`
  - `{package_directory}/data/models`
If you're using pre-trained models, place them in one of these locations. The UI will automatically detect and list available models in the classification section.
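The lookup can be reproduced in a few lines; here is a hypothetical helper (not part of the library) that probes the same three locations:

```python
import os

def find_model(name, package_dir="pyAudioAnalysis"):
    """Probe the model search paths listed above (hypothetical helper)."""
    candidates = [
        os.path.join("data", "models", name),
        os.path.join("..", "data", "models", name),
        os.path.join(package_dir, "data", "models", name),
    ]
    for path in candidates:
        if os.path.isfile(path):
            return path
    raise FileNotFoundError(f"{name!r} not found in any model search path")
```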
Note: You can train your own models using the command-line interface and place them in the models directory for use in the UI.