GitHub - miltiadiss/Classification-of-abnormal-respiratory-sounds-using-supervised-domain-adaptation: My Integrated Master of Science Thesis for CEID, University of Patras with Topic: "Supervised domain adaptation techniques for the classification of abnormal respiratory sounds".

Supervised domain adaptation techniques for the classification of abnormal respiratory sounds 🩺

Master of Science Thesis — CEID, University of Patras

Introduction

This repository contains the implementation and artifacts of my MSc thesis on “Supervised domain adaptation techniques for the classification of abnormal respiratory sounds.” The thesis aims to address the common issue of domain shift between different recording devices of respiratory sounds and to develop models that generalize better to unseen devices. The goal is to improve the classification of pathological respiratory sounds (crackles, wheezes) across domains by leveraging supervised domain adaptation methods.

Motivation & Background

Respiratory sound classification is a growing field in medical signal processing and can aid in non-invasive diagnosis of lung disorders.
However, models trained primarily on recordings from one device often perform poorly when tested on another, because recording conditions, sensor types, and subject populations differ.
Domain adaptation methods aim to reduce this gap by aligning feature distributions across source and target domains (devices).
This work explores supervised domain adaptation to improve cross-domain classification of abnormal respiratory sounds.

Objectives

Investigate supervised domain adaptation algorithms applicable to audio classification
Apply these techniques to respiratory sound datasets
Compare baseline models vs domain-adapted models across domain shifts
Analyze strengths, limitations and generalization potential

Key research questions include:

Which domain adaptation methods yield improved classification performance in cross-domain respiratory sound tasks?
How robust are the models when domain discrepancies are large?
What are the trade-offs among different adaptation approaches?

Methodology & Architecture

The methodology pipeline comprises the following stages:

Preprocessing & dataset augmentation
Feature extraction & selection
Supervised domain adaptation
Classifier training
Classifier evaluation across different devices

Below is a simplified architecture diagram:

In essence, the domain adaptation models are based on an adversarial training architecture, which forces them to learn embeddings in which the distributions of the different devices are aligned while class-discriminative power is preserved. Several adversarial methods were implemented and compared, including:

Domain Adversarial Neural Network (DANN)
Conditional Domain Adversarial Network (CDAN)
Domain Adversarial Neural Network with Variational Autoencoder (DANN with VAE)
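To illustrate how CDAN differs from DANN: in DANN the domain discriminator sees only the feature embedding, whereas CDAN conditions it on the classifier's predictions, commonly through a flattened outer product of class probabilities and features. The following is a minimal NumPy sketch of that conditioning step only. It is not the code in domain_adaptation_algorithms/cdan.py, and the function names, embedding size, and random inputs are assumptions made for illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def multilinear_map(features, class_probs):
    """Flattened per-sample outer product of class probabilities and features.

    features:    (batch, feat_dim) embeddings from the feature extractor
    class_probs: (batch, n_classes) softmax outputs of the label classifier
    returns:     (batch, n_classes * feat_dim) input for the domain discriminator
    """
    outer = np.einsum("bc,bd->bcd", class_probs, features)
    return outer.reshape(features.shape[0], -1)


rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))   # hypothetical 8-dimensional embeddings
logits = rng.normal(size=(4, 3))     # 3 classes: normal, crackle, wheeze
conditioned = multilinear_map(features, softmax(logits))
print(conditioned.shape)
```

Because each row of class probabilities sums to 1, summing the conditioned blocks over the class axis recovers the original features, which is a quick sanity check for the map.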

Experiments & Evaluation

Conducted experiments on the Respiratory Sound Database (RSDB): https://www.kaggle.com/datasets/vbookshelf/respiratory-sound-database
Compared the performance of baseline classifiers (k-NN, SVM, Random Forest, XGBoost) before and after applying supervised domain adaptation
Metrics: Accuracy, Weighted F1-score, and Macro AUC for overall evaluation; confusion matrices, ROC curves, Precision-Recall curves, Sensitivity, Specificity, F1-score, and MCC for class-wise evaluation
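The class-wise metrics listed above can all be derived from a multi-class confusion matrix by treating each class one-vs-rest. The following is a minimal sketch of that computation, not the code in utils/metrics.py; the function name and the toy confusion matrix are illustrative assumptions and do not reproduce the thesis results.

```python
import math


def one_vs_rest_metrics(confusion, class_index):
    """Class-wise metrics from a confusion matrix.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp
    fp = sum(confusion[i][class_index] for i in range(n_classes)) - tp
    tn = total - tp - fn - fp

    def ratio(numerator, denominator):
        # Return 0.0 instead of dividing by zero for degenerate cases.
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Toy confusion matrix (rows: true class, columns: predicted class).
toy_confusion = [
    [50, 5, 5],    # normal
    [10, 30, 0],   # crackle
    [5, 0, 15],    # wheeze
]
normal_metrics = one_vs_rest_metrics(toy_confusion, class_index=0)
for name, value in normal_metrics.items():
    print(f"{name}: {value:.3f}")
```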

Best classifier: Non-Linear SVM with C=10.0, γ='scale'

Best domain adaptation method: CDAN with λ=0.2
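For reference, the SVM configuration reported above could be instantiated with scikit-learn roughly as follows. This is a hedged sketch: the RBF kernel, the standardization step, and the synthetic data are assumptions made for illustration, and the actual setup lives in statistical_models/non_linear_svm.py.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic 3-class data standing in for extracted audio features (assumption).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# C and gamma follow the reported values; the RBF kernel is an assumed choice
# for the "non-linear" SVM.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"toy accuracy: {accuracy:.2f}")
```

With gamma="scale", scikit-learn sets gamma to 1 / (n_features * X.var()), so the kernel width adapts to the scale of the input features.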

Method	Accuracy (relative change vs. baseline)	Macro AUC	Weighted F1-score
Baseline	0.68	0.78	0.66
DANN	0.71 (+4.4%)	0.79	0.69
CDAN	0.77 (+13.2%)	0.86	0.77
DANN with VAE (joint training)	0.74 (+8.8%)	0.85	0.74
DANN with VAE (sequential training)	0.74 (+8.8%)	0.84	0.73
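The percentages in parentheses in the accuracy column are consistent with a relative change with respect to the baseline accuracy of 0.68. A short sketch of that arithmetic, using the values from the table (the function name is an illustrative choice):

```python
baseline_accuracy = 0.68
adapted_accuracy = {
    "DANN": 0.71,
    "CDAN": 0.77,
    "DANN with VAE (joint training)": 0.74,
    "DANN with VAE (sequential training)": 0.74,
}


def relative_gain_percent(new_value, base_value):
    """Relative change of new_value over base_value, in percent."""
    return 100.0 * (new_value - base_value) / base_value


gains = {
    method: relative_gain_percent(accuracy, baseline_accuracy)
    for method, accuracy in adapted_accuracy.items()
}
for method, gain in gains.items():
    print(f"{method}: {gain:+.1f}%")
```

For example, CDAN's gain is (0.77 - 0.68) / 0.68, which is about 13.2%.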

Class	Sensitivity	Specificity	F1-score	MCC
Normal	0.83	0.73	0.82	0.78
Crackle	0.71	0.89	0.72	0.61
Wheeze	0.65	0.96	0.68	0.80

Feature distribution across different classes and devices before and after the implementation of CDAN:

Confusion matrix, ROC curve, Precision-Recall curve of Non-Linear SVM before and after the implementation of CDAN:

Repository structure

/
├── README.md
├── config.yaml
├── .gitignore
├── LICENSE.txt
├── requirements.txt

├── domain_adaptation_algorithms/
│ ├── dann.py
│ ├── cdan.py
│ ├── davae.py
│ └── spectrum_correction.py

├── modules/
│ ├── preprocessing.ipynb
│ ├── feature_extraction.ipynb
│ ├── domain_adaptation.ipynb
│ └── classification.ipynb

├── RSDB/
│ └── RSDB_analysis.ipynb

├── statistical_models/
│ ├── evaluate.py
│ ├── knn.py
│ ├── non_linear_svm.py
│ ├── random_forest.py
│ └── xgboost.py

├── utils/
│ ├── audio_preprocessing.py
│ ├── metrics.py
│ ├── neural_networks.py
│ └── plots.py

└── Documentation/

Implementation

Languages / Tools: Python, Jupyter Notebooks
Key Modules / Packages: adaptation methods inside domain_adaptation_algorithms/, pipeline modules in modules/, baseline and statistical models in statistical_models/, helper functions in utils/
Configuration: config.yaml holds settings (paths, hyperparameters, domain adaptation choices)
Database: RSDB/RSDB_analysis.ipynb contains the exploratory analysis of the database
Report and results: stored under Documentation/ folder

Dependencies

The project requires Python 3.8+ and the following packages:

Core libraries: numpy, pandas, scipy, pyyaml
Machine Learning / Deep Learning: scikit-learn, torch, torchvision, xgboost
Audio processing: librosa
Visualization: matplotlib, seaborn
Notebooks: jupyter, notebook

Usage Instructions

To run the project:

git clone https://github.com/miltiadiss/CEID-MSc-Thesis.git
cd CEID-MSc-Thesis

# (Optional) create virtual environment
python3 -m venv venv
source venv/bin/activate   # Linux / macOS
venv\Scripts\activate      # Windows PowerShell

# Install dependencies
pip install -r requirements.txt

# Example: train and evaluate a domain adaptation model
python domain_adaptation_algorithms/dann.py --config config.yaml

# Example: run a statistical model
python statistical_models/random_forest.py

# Or explore module-specific notebooks
jupyter notebook modules/preprocessing.ipynb
