# Visual Question Answering (VQA) System


## 📌 Overview

This project implements a Visual Question Answering (VQA) system in PyTorch. Given an image and a natural-language question about it as input, the system predicts an answer. It combines computer-vision backbones (such as ResNet18 and DenseNet121) with NLP models (such as BioBERT) through attention mechanisms.

Additionally, this repository includes a complete web-based user interface (app.py), comprehensive training and evaluation pipelines, and model explainability features via Grad-CAM.

## ✨ Features

- **Multi-Modal Architecture**: Combines CNNs for image feature extraction with Transformer-based models for question encoding.
- **Advanced Attention**: Uses Cross-Attention and CBAM (Convolutional Block Attention Module) for enhanced feature fusion.
- **Explainable AI (XAI)**: Integrated Grad-CAM (`gradcam.py`) to visualize where the model "looks" when answering a question.
- **Web Interface**: Easy-to-use frontend built with HTML/CSS (`templates/index.html`), served by a backend API (`app.py`).
- **Comprehensive Evaluation**: Built-in scripts (`evaluate_vqa.py`, `ieee_charts.py`) for precision, recall, and accuracy analysis, and for generating paper-ready charts.
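The exact fusion logic lives in the repository's `model*.py` files; as an illustration of the cross-attention idea (all names and dimensions here are hypothetical), question tokens can attend over the flattened CNN feature map like this:

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Sketch: question tokens (queries) attend over image patch features."""
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, question_feats, image_feats):
        # question_feats: (B, Lq, dim); image_feats: (B, Li, dim)
        fused, _ = self.attn(query=question_feats, key=image_feats, value=image_feats)
        return self.norm(fused + question_feats)  # residual + layer norm

fusion = CrossAttentionFusion()
q = torch.randn(2, 12, 256)   # e.g. 12 question tokens from a text encoder
v = torch.randn(2, 49, 256)   # e.g. a 7x7 CNN feature map, flattened and projected
out = fusion(q, v)
print(out.shape)  # torch.Size([2, 12, 256])
```

The fused tokens keep the question's sequence length, so they can be pooled and fed to an answer classifier.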

## 📂 Project Structure

```
MINI-PRO/
├── app.py                      # Web application entry point
├── templates/index.html        # Frontend UI for the web app
├── dataset.py                  # Dataloaders and Dataset class definitions
├── model*.py                   # Multiple VQA model architecture versions
├── train_*.py                  # Training scripts (baseline, resnet18, vqa_v3)
├── evaluate_*.py               # Evaluation metrics and reporting
├── gradcam.py                  # Grad-CAM visualization generator
├── generate_plots.py           # Evaluation chart plotting script (Matplotlib/Seaborn)
└── *.json                      # Training, validation, and testing dataset files
```

## ⚙️ Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/Shiv0087/MINI-PRO.git
   cd MINI-PRO
   ```

2. Create a virtual environment (recommended):

   ```bash
   python -m venv venv
   # On Windows:
   venv\Scripts\activate
   # On Mac/Linux:
   source venv/bin/activate
   ```

3. Install the required dependencies:

   ```bash
   pip install flask torch torchvision numpy pandas matplotlib seaborn scikit-learn transformers
   ```

   *(Note: adjust dependencies to match your specific setup.)*

## 🚀 Usage

### 1. Web Application

To run the interactive web interface, start the application server:

```bash
python app.py
```

Then navigate to http://localhost:5000 (or the provided port) in your web browser.
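The repository's `app.py` is the authoritative implementation; as a rough illustration of what such a Flask backend does (route and form-field names below are hypothetical, not taken from `app.py`), a minimal predict endpoint looks like:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["POST"])
def predict():
    # Field names are hypothetical; app.py / templates/index.html define the real ones.
    question = request.form.get("question", "")
    image = request.files.get("image")
    if image is None or not question:
        return jsonify(error="both an image and a question are required"), 400
    # In the real app, model inference happens here, e.g.:
    # answer = vqa_model.answer(image, question)
    answer = "placeholder"
    return jsonify(question=question, answer=answer)

# To serve locally (as `python app.py` does):
# app.run(port=5000)
```

The frontend posts the uploaded image and typed question to this endpoint and renders the returned JSON answer.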

### 2. Training the Model

To start training from scratch, run the desired training script. For example:

```bash
python train_vqa.py
```

### 3. Evaluation & Visualization

To evaluate a trained model and generate reports:

```bash
python evaluate_vqa.py
python generate_plots.py
```
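The exact reporting in `evaluate_vqa.py` is not reproduced here, but the precision/recall/accuracy metrics it mentions can be computed with scikit-learn (already in the dependency list). A small example with hypothetical answer labels:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Hypothetical ground-truth vs. predicted answers, encoded as class indices.
y_true = [0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 0, 2, 2, 1]

acc = accuracy_score(y_true, y_pred)                     # 4 of 6 correct
prec = precision_score(y_true, y_pred, average="macro")  # unweighted mean over classes
rec = recall_score(y_true, y_pred, average="macro")
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

`average="macro"` treats every answer class equally; for the typically long-tailed answer distributions in VQA, `average="weighted"` is a common alternative.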

## 📊 Evaluation & Explainability

This project goes beyond mere metrics by including detailed visual reporting:

- `ieee_charts.py`: Generates publication-ready evaluation charts.
- `generate_good_gradcam.py`: Outputs Grad-CAM heatmaps highlighting the visual regions the model attended to when generating its answer.
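`generate_good_gradcam.py` holds this project's actual implementation; the following is an independent minimal sketch of the Grad-CAM idea it applies: pool the gradients of the target class score over the last conv layer's spatial dimensions, use them to weight that layer's activation channels, and keep the positive part.

```python
import torch
import torch.nn.functional as F

def grad_cam(model, last_conv, image, class_idx):
    """Minimal Grad-CAM: heatmap of the regions driving the score for class_idx."""
    acts, grads = {}, {}

    def save_act(module, inputs, output):
        acts["v"] = output                      # activations of the last conv layer

    def save_grad(module, grad_in, grad_out):
        grads["v"] = grad_out[0]                # gradient of the score w.r.t. those activations

    h1 = last_conv.register_forward_hook(save_act)
    h2 = last_conv.register_full_backward_hook(save_grad)
    try:
        score = model(image)[0, class_idx]
        model.zero_grad()
        score.backward()
    finally:
        h1.remove()
        h2.remove()
    weights = grads["v"].mean(dim=(2, 3), keepdim=True)  # channel-wise pooled gradients
    cam = F.relu((weights * acts["v"]).sum(dim=1))       # weighted channel sum, positive part
    return cam / (cam.max() + 1e-8)                      # normalize to [0, 1], shape (B, H, W)

# Toy usage with a tiny CNN; the same call works for any model exposing a conv layer.
conv = torch.nn.Conv2d(3, 8, 3, padding=1)
model = torch.nn.Sequential(
    conv, torch.nn.ReLU(), torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(), torch.nn.Linear(8, 5),
)
cam = grad_cam(model, conv, torch.randn(1, 3, 32, 32), class_idx=2)
print(cam.shape)  # torch.Size([1, 32, 32])
```

Upsampled to the input resolution and overlaid on the image, this heatmap is what the generated visualizations show.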

## 📝 License

This project is open-source. Feel free to use, modify, and distribute it as needed.


Developed by Shivraj
