A research module for benchmarking CNN models — from lightweight from-scratch architectures to heavier pretrained networks — on predicting emotions from spectrogram images.
| Stack | Tech |
|---|---|
| Language | Python |
| Frameworks | |
| Data Processing | |
| Visualization | |
```
Emoti-Spectro/
│
├── metrics/                  # Directory for storing model metrics
├── training_logs/            # Logs from model training sessions
├── training_plots/           # Graphs and plots of training history
│
├── 00_prep.py                # Initial data preparation script
├── 01_convertor.py           # Audio format conversion utility
├── 02_melspectro.py          # Generates Mel-spectrograms from audio
│
├── 03_lightcnn.py            # Light custom CNN model implementation
├── 04_deepcnn.py             # Deeper custom CNN model implementation
├── 05_mobilenetv2.py         # MobileNetV2 transfer-learning script
├── 06_mobilenetv2_fine.py    # Fine-tuning MobileNetV2
├── 07_deepcnn_rgb.py         # Deep CNN for RGB spectrograms
├── 08_lightcnn_rgb.py        # Light CNN for RGB spectrograms
├── 09_effnetv2.py            # EfficientNetV2 transfer learning
├── 10_effnetv2_ft.py         # Fine-tuning EfficientNetV2
├── 11_grucnn.py              # Hybrid GRU-CNN model
├── 12_cnngru_effnetb0_ft.py  # CNN-GRU hybrid on EfficientNet-B0, fine-tuned
│
├── graphmaker.py             # Utility to create performance graphs
├── result_calc.py            # Script to calculate and display results
└── requirements.txt          # Project dependencies
```
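For context on what `02_melspectro.py` produces: a Mel-spectrogram is a short-time power spectrum projected onto the perceptual mel scale and log-scaled to decibels. The project's own implementation is not shown here (it may well use an audio library such as librosa); the sketch below is only an illustration of the underlying computation in plain NumPy, and the parameter values (`n_fft`, `hop`, `n_mels`) are arbitrary defaults, not the project's settings:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters centered at points evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):            # rising slope of triangle i
            if center > left:
                fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):           # falling slope of triangle i
            if right > center:
                fb[i - 1, k] = (right - k) / (right - center)
    return fb

def mel_spectrogram(y, sr, n_fft=1024, hop=256, n_mels=64):
    """Frame the signal, take |STFT|^2, project onto mel filters, log-scale."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2       # (frames, bins)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T      # (frames, n_mels)
    return 10.0 * np.log10(np.maximum(mel, 1e-10)).T       # (n_mels, frames) in dB
```

The resulting `(n_mels, frames)` array is what gets rendered as the spectrogram "image" a CNN can consume; one second of 16 kHz audio with these defaults yields a 64×59 matrix.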
Ensure you have Python installed, then create a virtual environment and install the dependencies:

```bash
python -m venv venv

# Windows
venv\Scripts\activate

# Linux/Mac
source venv/bin/activate

pip install -r requirements.txt
```

Run the preparation scripts in order to process your audio dataset into spectrograms ready for training:
```bash
python 00_prep.py
python 01_convertor.py
python 02_melspectro.py
```

Select a model script to start training. For example, to train the light CNN model:

```bash
python 03_lightcnn.py
```

Or to use a pretrained model such as MobileNetV2:

```bash
python 05_mobilenetv2.py
```

After training, check the `training_plots/` directory for accuracy/loss graphs, or run the result calculator:

```bash
python result_calc.py
```
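The internals of `result_calc.py` are not shown in this README. As a hedged sketch of the kind of summary such a script might compute — the function name, metric choices, and emotion labels below are assumptions for illustration, not the project's actual code:

```python
import numpy as np

def summarize(y_true, y_pred, labels):
    """Per-class precision/recall/F1 plus overall accuracy and macro-F1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    rows, f1s = [], []
    for c in labels:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        f1s.append(f1)
        rows.append((c, float(prec), float(rec), float(f1)))
    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        "macro_f1": float(np.mean(f1s)),
        "per_class": rows,
    }
```

Macro-F1 is a reasonable companion to accuracy here because emotion datasets are often class-imbalanced, and it weights each emotion equally regardless of its frequency.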