saydek217/Stereo-Classifier

Stereo Image Classification

Project Overview

This project implements a deep learning model for classifying stereo images (9×9 pixels) into two classes: 0 and 1.
The model is a convolutional neural network (CNN) that processes 2-channel images:

  • Channel 1: Grayscale image
  • Channel 2: Disparity map (NA values encoded as 255)

Quick Start

Prerequisites

Install the required Python packages:

pip install torch torchvision numpy matplotlib scikit-learn seaborn tqdm pandas Pillow

Train the Model (Main Experiment)

python train.py --data_path ./img --batch_size 64 --lr 0.0005 --epochs 20 --experiment_name 20_epochs_test

To run with default parameters:

python train.py

Project Structure

project/
├── train.py                  # Main training script with CLI
├── img/                      # Dataset folder (PNG images)
│   ├── label_0_vol_001_00000050_3416.png
│   ├── label_1_vol_001_00000100_2713.png
│   └── ...
├── logs/                     # Training logs and results
│   └── 20_epochs_test/       # Example experiment
│       ├── images/           # Generated graphs and plots
│       │   ├── training_curves.png
│       │   ├── confusion_matrix.png
│       │   ├── roc_curve.png
│       │   ├── precision_recall_curve.png
│       │   ├── class_metrics.png
│       │   └── inference_grid_*.png
│       ├── metrics/          # Numerical results
│       │   ├── training_metrics.json
│       │   ├── training_history.csv
│       │   ├── evaluation_metrics.json
│       │   ├── classification_report.csv
│       │   └── confusion_matrix.csv
│       ├── inference/        # Prediction visualizations
│       ├── training_config.json
│       ├── summary_report.json
│       └── stereo_classifier_final.pth
└── utils/                    # Modular code organization
    ├── dataset.py            # StereoImageDataset class
    ├── model.py              # StereoClassifier CNN architecture
    ├── utils.py              # Training / evaluation / logging helpers
    ├── train.py              # Training loop utilities
    └── inference.py          # Inference and visualization functions
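
The example filenames under img/ suggest that the class label is carried by the label_<n>_ prefix. A minimal sketch of recovering it is shown below; the pattern is inferred from the two filenames in the tree, and the function name label_from_filename is hypothetical, not taken from utils/dataset.py.

```python
import re
from pathlib import Path

# Assumed pattern, inferred from the two example filenames in the tree above.
LABEL_PATTERN = re.compile(r"^label_(\d+)_")


def label_from_filename(path):
    """Return the integer class label encoded in a dataset filename."""
    match = LABEL_PATTERN.match(Path(path).name)
    if match is None:
        raise ValueError(f"no label prefix in filename: {path}")
    return int(match.group(1))


print(label_from_filename("img/label_0_vol_001_00000050_3416.png"))  # 0
print(label_from_filename("img/label_1_vol_001_00000100_2713.png"))  # 1
```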

Model Architecture: StereoClassifier

Input: 2-channel, 9×9 pixel images

Convolutional backbone:

  • Conv2d(2 → 16, kernel_size=3, padding=1) + ReLU + BatchNorm + MaxPool2d(2)
  • Conv2d(16 → 32, kernel_size=3, padding=1) + ReLU + BatchNorm + MaxPool2d(2)
  • Conv2d(32 → 64, kernel_size=3, padding=1) + ReLU + BatchNorm
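
The spatial size at each stage can be traced with the standard output-size formulas for convolution and pooling. The sketch below assumes stride 1 for the convolutions and the default MaxPool2d stride (equal to the kernel size, with floor rounding). It shows that a 9×9 input ends as 64 feature maps of 2×2, which flatten to 256 features.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output side length of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Output side length of max pooling with stride == kernel (floor mode)."""
    return (size - kernel) // kernel + 1


def trace_spatial_sizes(input_size=9):
    """Return the side length after each conv/pool stage of the backbone."""
    sizes = []
    size = conv_out(input_size, 3, padding=1)  # conv block 1
    sizes.append(size)
    size = pool_out(size, 2)                   # max pool 1
    sizes.append(size)
    size = conv_out(size, 3, padding=1)        # conv block 2
    sizes.append(size)
    size = pool_out(size, 2)                   # max pool 2
    sizes.append(size)
    size = conv_out(size, 3, padding=1)        # conv block 3 (no pooling)
    sizes.append(size)
    return sizes


sizes = trace_spatial_sizes(9)
flat_features = 64 * sizes[-1] * sizes[-1]
print(sizes)          # [9, 4, 4, 2, 2]
print(flat_features)  # 256
```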

Classifier head:

  • Flatten
  • Fully connected: 256 → 64 → 32 → 2 (2 output classes)
  • Dropout: p = 0.3 (first FC layer), p = 0.2 (second FC layer)
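
The layer list above can be sketched in PyTorch roughly as follows. This is an illustration, not the code in utils/model.py: the class name StereoClassifierSketch, the ReLU activations between the fully connected layers, and the placement of dropout after each activation are assumptions.

```python
import torch
from torch import nn


class StereoClassifierSketch(nn.Module):
    """Illustrative re-creation of the described architecture (not the repo's code)."""

    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(16),
            nn.MaxPool2d(2),   # 9x9 -> 4x4
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.MaxPool2d(2),   # 4x4 -> 2x2
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),          # 64 * 2 * 2 = 256 features
            nn.Linear(256, 64),
            nn.ReLU(),             # assumed activation
            nn.Dropout(p=0.3),
            nn.Linear(64, 32),
            nn.ReLU(),             # assumed activation
            nn.Dropout(p=0.2),
            nn.Linear(32, num_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


model = StereoClassifierSketch().eval()
with torch.no_grad():
    logits = model(torch.rand(4, 2, 9, 9))  # batch of 4 two-channel 9x9 patches
print(logits.shape)  # torch.Size([4, 2])
```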

Data Preprocessing

Channel 1 — Grayscale

  • Pixel values normalized to [0, 1].

Channel 2 — Disparity

  • NA (missing) values are stored as 255 in the raw image.
  • These NA values are remapped to 0.
  • Pixel values are then normalized to [0, 1].
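
The two preprocessing paths can be sketched with NumPy as below. This is a sketch under assumptions: the function takes the grayscale and disparity channels as two separate uint8 arrays, and "normalized to [0, 1]" is read as division by 255. How utils/dataset.py actually loads and scales the channels is not stated here.

```python
import numpy as np

NA_VALUE = 255  # NA marker in the disparity channel, as described above


def preprocess(gray, disparity):
    """Build a (2, H, W) float32 array from uint8 grayscale and disparity channels."""
    gray_norm = gray.astype(np.float32) / 255.0

    disp = disparity.astype(np.float32)
    disp[disparity == NA_VALUE] = 0.0  # remap NA to 0 before scaling
    disp_norm = disp / 255.0           # assumption: same 255 divisor as grayscale

    return np.stack([gray_norm, disp_norm], axis=0)


# Synthetic 9x9 example: one NA pixel and one valid disparity value.
gray = np.full((9, 9), 255, dtype=np.uint8)
disparity = np.zeros((9, 9), dtype=np.uint8)
disparity[0, 0] = NA_VALUE
disparity[1, 1] = 51

patch = preprocess(gray, disparity)
print(patch.shape)  # (2, 9, 9)
```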

Data Augmentation

  • Random horizontal flip (probability = 0.5)
  • Random rotation in the range ±5°

Training Configuration

Main experiment command:

python train.py --data_path ./img --batch_size 64 --lr 0.0005 --epochs 20 --experiment_name 20_epochs_test

Hyperparameters

Parameter            Value          Description
Batch size           64             Training batch size
Learning rate        0.0005         Adam optimizer LR
Epochs               20             Total training epochs
Weight decay         1e-4           L2 regularization
Early stopping       10             Patience (epochs)
Train / Val / Test   70 / 15 / 15   Data split (percent)
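
As an illustration of the 70 / 15 / 15 split, the sketch below shuffles sample indices with a fixed seed and slices them into three parts. The seed, the function name split_indices, and the use of a plain (non-stratified) shuffle are assumptions; the project may split differently.

```python
import random


def split_indices(n_samples, fractions=(0.70, 0.15, 0.15), seed=42):
    """Shuffle indices 0..n_samples-1 and split them into train/val/test lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for a reproducible split

    n_train = round(n_samples * fractions[0])
    n_val = round(n_samples * fractions[1])

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 700 150 150
```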

Training Strategy

  • Optimizer: Adam (with weight decay)
  • Loss: CrossEntropyLoss
  • LR scheduling: ReduceLROnPlateau
  • Validation: Per epoch with detailed metrics
  • Checkpointing: Best model saved as stereo_classifier_final.pth
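
The pieces of this strategy fit together roughly as in the sketch below. It uses a stand-in linear model and synthetic data so that it runs on its own. The scheduler's factor and patience values and the choice to monitor validation loss are assumptions, as is the in-memory best_state copy that stands in for writing the checkpoint file.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Linear(8, 2)  # stand-in for the CNN, to keep the sketch small
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=5e-4, weight_decay=1e-4)
# factor and patience are assumed values, not taken from the project
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=3
)

# Synthetic data in place of the real train/validation loaders.
x_train, y_train = torch.randn(64, 8), torch.randint(0, 2, (64,))
x_val, y_val = torch.randn(32, 8), torch.randint(0, 2, (32,))

best_val, bad_epochs, patience = float("inf"), 0, 10
best_state = None

for epoch in range(20):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(x_train), y_train)
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = criterion(model(x_val), y_val).item()
    scheduler.step(val_loss)  # lower the LR when validation loss plateaus

    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        # Keep the best weights; a real run could torch.save() them instead.
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
    else:
        bad_epochs += 1
        if bad_epochs >= patience:  # early stopping
            break

print(f"best validation loss: {best_val:.4f}")
```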

Results Summary

Training Behavior

  • Convergence: Rapid, within roughly the first 5 epochs
  • Stability: Minimal oscillation in loss and accuracy
  • Generalization: Validation performance is comparable to or better than training performance, which is consistent with dropout being disabled at evaluation time
  • Efficiency: The main experiment trains in about 5 minutes or less on a GPU

Usage Examples

Basic Training

# Use default arguments
python train.py

Custom Training

python train.py \
    --data_path ./img \
    --batch_size 32 \
    --lr 0.001 \
    --epochs 50 \
    --experiment_name custom_run

Additional Options

python train.py \
    --data_path ./img \
    --batch_size 64 \
    --lr 0.0005 \
    --epochs 20 \
    --experiment_name my_experiment \
    --no_cuda   # Force CPU-only training

Key Output Files to Inspect

High-Level Overview

  • logs/20_epochs_test/summary_report.json — Executive summary of the experiment
  • logs/20_epochs_test/training_config.json — Full configuration used for training

Detailed Analysis

  • logs/20_epochs_test/images/training_curves.png — Loss and accuracy over epochs
  • logs/20_epochs_test/images/confusion_matrix.png — Classification error structure
  • logs/20_epochs_test/images/roc_curve.png — ROC curve
  • logs/20_epochs_test/images/precision_recall_curve.png — Precision–Recall profile
  • logs/20_epochs_test/images/inference_grid_1.png — Sample predictions
  • logs/20_epochs_test/metrics/evaluation_metrics.json — Global performance metrics
  • logs/20_epochs_test/metrics/classification_report.csv — Per-class metrics
  • logs/20_epochs_test/metrics/confusion_matrix.csv — Numerical confusion matrix
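
For a quick look at the JSON outputs listed above, a small helper can load every .json file under an experiment folder without assuming anything about its keys. The helper name load_json_reports is hypothetical and is not part of the project.

```python
import json
from pathlib import Path


def load_json_reports(experiment_dir):
    """Load every *.json file under a folder, keyed by its relative POSIX path."""
    root = Path(experiment_dir)
    return {
        path.relative_to(root).as_posix(): json.loads(path.read_text())
        for path in sorted(root.rglob("*.json"))
    }


# Prints nothing if the experiment folder has not been created yet.
reports = load_json_reports("logs/20_epochs_test")
for name, content in reports.items():
    print(name, type(content).__name__)
```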
