πŸ”¬ PCB Defect Detection & Classification System

Automated Optical Inspection powered by Computer Vision and Deep Learning

Overview β€’ Motivation β€’ Architecture β€’ How It Works β€’ Model β€’ Web App β€’ Installation β€’ Usage β€’ Results β€’ Structure β€’ Docs


πŸ“Œ Overview

The PCB Defect Detection & Classification System is a production-grade, end-to-end automated optical inspection (AOI) solution that identifies and classifies manufacturing defects on Printed Circuit Boards by combining classical computer vision with deep learning.

The system processes a single PCB image through a five-stage pipeline β€” preprocessing, defect localization, region-of-interest extraction, neural network classification, and result annotation β€” delivering fully labeled outputs with bounding boxes, confidence scores, and downloadable reports in under 3 seconds on standard CPU hardware.

It is built on the DeepPCB dataset and uses a fine-tuned EfficientNetB0 convolutional neural network to classify defects into six industry-standard categories, targeting a classification accuracy of β‰₯ 95% on the held-out test set.

Live Repository: https://github.com/Cheekurthi-Vamsi/PCB-Defect-Detection-and-Classification-System


πŸ’‘ Motivation

Printed Circuit Boards are the backbone of every modern electronic device β€” from smartphones and laptops to medical equipment and aerospace systems. A single manufacturing defect that goes undetected can cause:

  • Catastrophic field failures β€” devices that malfunction after shipment
  • Safety-critical hazards β€” particularly in automotive, aviation, and medical sectors
  • Financial losses β€” recalls, warranty replacements, and brand damage
  • Production waste β€” boards scrapped late in the assembly cycle are far more expensive to replace than those caught at the bare-board stage

The Gap in Existing AOI Solutions

Traditional Automated Optical Inspection systems are widely deployed on manufacturing lines, but they come with serious limitations:

| Approach | Core Limitation |
|---|---|
| Rule-based AOI | Requires hand-tuned thresholds for every board layout; breaks under lighting variation |
| Template matching | Fails when the reference template is unavailable or when boards warp slightly |
| Manual inspection | Operator fatigue causes detection rates to drop below 70% after extended shifts |
| Simple thresholding | Produces high false-positive rates and cannot identify what type of defect was found |

The Solution

This project addresses these gaps by combining the robustness of classical image processing with the representational power of transfer learning. The result is a system that:

  • Works from a single image β€” no reference template database needed
  • Produces class-level labels β€” not just "defect found", but exactly what kind
  • Runs in real time β€” under 3 seconds per image on CPU
  • Provides explainable output β€” annotated images, confidence scores, and exportable logs
  • Is accessible via a browser β€” no command-line expertise needed to operate

πŸ— System Architecture

The system is organized as a linear pipeline with five clearly separated processing stages. Each stage is implemented as an independent, testable Python function in backend/inference.py, making individual components reusable and easy to validate in isolation.

╔══════════════════════════════════════════════════════════════════════╗
β•‘                        INPUT LAYER                                   β•‘
β•‘          Raw PCB image (JPG / PNG / BMP / TIF, any resolution)       β•‘
β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
                           β•‘
           ╔═══════════════▼═══════════════╗
           β•‘       STAGE 1: PREPROCESSING   β•‘
           β•‘  β€’ BGR decode from bytes        β•‘
           β•‘  β€’ Convert to grayscale         β•‘
           β•‘  β€’ 5Γ—5 median blur (denoise)    β•‘
           β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
                           β•‘
          ╔════════════════╩════════════════╗
          β•‘                                 β•‘
╔═════════▼══════════╗           ╔══════════▼═══════════╗
β•‘  ADAPTIVE THRESH   β•‘           β•‘   CANNY EDGE DETECT  β•‘
β•‘  blockSize=31, C=8 β•‘           β•‘   low=40, high=160   β•‘
β•‘  Local anomalies   β•‘           β•‘   Structural edges   β•‘
β•šβ•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•           β•šβ•β•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•β•
          β•‘     Bitwise OR (fusion)          β•‘
          β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
                           β•‘
           ╔═══════════════▼═══════════════╗
           β•‘   STAGE 2: DEFECT MASK         β•‘
           β•‘  β€’ Morph Close (5Γ—5 kernel)    β•‘
           β•‘  β€’ Morph Open  (3Γ—3 kernel)    β•‘
           β•‘  β†’ Binary defect mask (0/255)  β•‘
           β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
                           β•‘
           ╔═══════════════▼═══════════════╗
           β•‘   STAGE 3: ROI EXTRACTION      β•‘
           β•‘  β€’ findContours (EXTERNAL)     β•‘
           β•‘  β€’ Filter: area, aspect ratio  β•‘
           β•‘  β€’ Compute centroid (cx, cy)   β•‘
           β•‘  β€’ Crop 128Γ—128 patch per ROI  β•‘
           β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
                           β•‘
           ╔═══════════════▼═══════════════╗
           β•‘   STAGE 4: CLASSIFICATION      β•‘
           β•‘  β€’ Normalize patch [0,1]       β•‘
           β•‘  β€’ EfficientNetB0 forward pass β•‘
           β•‘  β€’ Softmax β†’ 6-class probs     β•‘
           β•‘  β€’ argmax β†’ class + confidence β•‘
           β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•¦β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•
                           β•‘
           ╔═══════════════▼═══════════════╗
           β•‘   STAGE 5: ANNOTATION & OUT    β•‘
           β•‘  β€’ Colored bounding boxes      β•‘
           β•‘  β€’ Class label + confidence %  β•‘
           β•‘  β€’ PNG / CSV / TXT export      β•‘
           β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•

Two Operating Modes

The backend supports two inference modes, both exposed through the same modular function library:

| Mode | Function | Use Case | Template Required |
|---|---|---|---|
| Single-Image | run_inference_single() | Web app / production | ❌ No |
| Dual-Image (Legacy) | run_inference() | Paired training data evaluation | ✅ Yes |

The single-image mode uses adaptive thresholding + Canny edges for localization. The dual-image mode uses ORB-based alignment + absolute difference + Otsu's thresholding β€” the classic template-subtraction approach used during training data preparation.


βš™οΈ How It Works

Step 1 β€” Intelligent Defect Localization

Before classification can happen, the system must pinpoint where on the PCB defects are located. This is a segmentation problem, and it is solved without any reference image.

Adaptive Thresholding applies a Gaussian-weighted local threshold across a sliding 31Γ—31 pixel window. Rather than comparing each pixel to a single global threshold, it compares each pixel to the weighted mean of its local neighborhood. This makes the system resilient to:

  • Non-uniform illumination across the PCB surface
  • Varying board colors and coating finishes
  • Slight color differences between board batches

Pixels whose intensity deviates significantly from their local context are flagged as anomalous and included in the defect map.

Canny Edge Detection runs in parallel on the same preprocessed image, identifying sharp intensity gradients that correspond to structural boundaries. Defects like open circuits (broken traces) and shorts (trace bridges) produce distinctly sharp edges that adaptive thresholding alone may miss.

Both maps are merged using a bitwise OR β€” any pixel flagged by either method is included in the combined candidate mask. Morphological close (fills small holes within blobs) and open (removes isolated single-pixel noise) operations then clean the mask.

Step 2 β€” Region of Interest Extraction

cv2.findContours retrieves the external boundaries of every white blob in the cleaned binary mask. Each contour represents a candidate defect region.

Quality filters remove false-positive regions:

| Filter | Value | Removes |
|---|---|---|
| Minimum area | 80 px² | Sub-pixel noise and camera artefacts |
| Maximum area | 50,000 px² | Whole-board regions and large background blobs |
| Aspect ratio (training only) | ≤ 3.5 | Elongated trace lines misidentified as defects |
| Circularity (Missing Hole only) | ≥ 0.45 | Non-circular regions that cannot be drill holes |

For each accepted contour, the centroid (cx, cy) is computed from image moments, and a 128 × 128 pixel patch is cropped from the original image, centered on the centroid. This patch size was chosen to be large enough to include the defect's structural context while small enough to match the EfficientNetB0 training resolution.

Step 3 β€” Neural Network Classification

Each 128Γ—128 RGB patch is normalized to the [0.0, 1.0] float range and passed through the EfficientNetB0 model. The final softmax layer outputs a 6-element probability vector, one value per defect class. The argmax of this vector gives the predicted class, and its value gives the confidence score.

The model was trained with a two-phase transfer learning strategy (see Deep Learning Model) and achieves robust generalization across board layouts unseen during training.

Step 4 β€” Annotation

The original PCB image is annotated with:

  • A colored rectangular bounding box around each defect region (color unique per class)
  • A filled label bar above each box showing the class name and confidence percentage
  • Anti-aliased text rendering for clean output at any resolution

Step 5 β€” Results & Export

Results are returned as a Python dict containing NumPy RGB image arrays and a prediction list. The Streamlit frontend renders these into the four-panel visual output and provides four downloadable files.


🧠 Deep Learning Model

Architecture: EfficientNetB0 + Custom Classification Head

EfficientNetB0 was selected as the backbone for three reasons:

  1. Efficiency: ~4.2M parameters vs. 23M for ResNet50 and 138M for VGG16 β€” significantly faster inference on CPU
  2. Accuracy: Compound scaling (simultaneous depth + width + resolution scaling) delivers state-of-the-art ImageNet accuracy at low parameter cost
  3. Transferability: ImageNet pre-training provides strong low-level feature extractors (edge detectors, texture filters) that directly benefit PCB defect recognition
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Input Layer  β”‚  128 Γ— 128 Γ— 3 (RGB)            β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  EfficientNetB0 Backbone                        β”‚
β”‚  (ImageNet weights, include_top=False)          β”‚
β”‚  Output: 4 Γ— 4 Γ— 1280 feature maps             β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  GlobalAveragePooling2D  β†’ 1280-d vector        β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  BatchNormalization                             β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Dropout  (rate = 0.4)                          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Dense  (256 units, ReLU activation)            β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  BatchNormalization                             β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Dropout  (rate = 0.3)                          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Dense  (6 units, Softmax activation)           β”‚
β”‚  β†’ [p_missing_hole, p_mouse_bite, ...]          β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
  Total trainable params (fine-tune): ~4.2 M
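
The head in the table above maps directly onto a few Keras lines. This sketch accepts `weights=None` so the architecture can be inspected without downloading anything; the actual training uses ImageNet weights as stated, and `build_model` is an illustrative name:

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_model(weights=None):
    """EfficientNetB0 backbone + the custom head from the table:
    GAP -> BN -> Dropout(0.4) -> Dense(256, relu) -> BN -> Dropout(0.3)
    -> Dense(6, softmax)."""
    backbone = keras.applications.EfficientNetB0(
        include_top=False, weights=weights, input_shape=(128, 128, 3))
    x = layers.GlobalAveragePooling2D()(backbone.output)  # -> 1280-d vector
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.4)(x)
    x = layers.Dense(256, activation="relu")(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.3)(x)
    out = layers.Dense(6, activation="softmax")(x)        # 6 defect classes
    return keras.Model(backbone.input, out)
```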

Two-Phase Training Strategy

A naive approach of training the entire network from scratch would either overfit (too few samples) or take very long to converge. Transfer learning solves both problems, and the two-phase strategy prevents catastrophic forgetting:

Phase 1 β€” Warm-Up (backbone completely frozen)

All EfficientNetB0 weights are frozen. Only the custom classification head is trained. This lets the head quickly adapt its weights toward the defect classification task before being exposed to backbone gradients.

| Parameter | Value |
|---|---|
| Maximum epochs | 10 |
| Adam learning rate | 1 × 10⁻³ |
| Early stopping patience | 5 epochs (monitors val_accuracy) |
| LR reduction | On val_loss plateau, factor 0.5, patience 3 |

Phase 2 β€” Fine-Tuning (top 30 backbone layers unfrozen)

The top 30 layers of EfficientNetB0 are unfrozen and trained jointly with the head. A smaller learning rate (1/10th of Phase 1) preserves the low-level ImageNet features in the lower backbone layers while allowing the high-level feature extractors to specialize for PCB defect patterns.

| Parameter | Value |
|---|---|
| Maximum epochs | 20 |
| Adam learning rate | 1 × 10⁻⁴ |
| Early stopping patience | 8 epochs (monitors val_accuracy) |
| LR reduction | On val_loss plateau, factor 0.5, patience 4 |

Training Callbacks

| Callback | Configuration | Purpose |
|---|---|---|
| ModelCheckpoint | save_best_only=True, monitors val_accuracy | Saves peak-performance checkpoint |
| EarlyStopping | restore_best_weights=True | Prevents overfitting past peak |
| ReduceLROnPlateau | Factor 0.5, min_lr=1×10⁻⁷ | Adapts LR to loss landscape |
| CSVLogger | Appended across both phases | Full epoch-by-epoch audit trail |
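
The callback configuration translates to standard Keras objects, shown here with the Phase-1 patience values (file names are illustrative):

```python
from tensorflow import keras

callbacks = [
    # Save only the checkpoint with the best validation accuracy
    keras.callbacks.ModelCheckpoint("best_model.keras",
                                    monitor="val_accuracy",
                                    save_best_only=True),
    # Stop once val_accuracy plateaus and roll back to the best weights
    keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=5,
                                  restore_best_weights=True),
    # Halve the learning rate when val_loss plateaus, down to 1e-7
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5,
                                      patience=3, min_lr=1e-7),
    # append=True lets one log file span both training phases
    keras.callbacks.CSVLogger("training_log.csv", append=True),
]
```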

Data Augmentation

Applied only to the training split to simulate the variability of real board images:

| Transformation | Range / Setting |
|---|---|
| Rotation | ±20° |
| Width shift | ±15% |
| Height shift | ±15% |
| Shear | 10% |
| Zoom | 15% |
| Horizontal flip | Enabled |
| Vertical flip | Enabled |
| Brightness | [0.8, 1.2] |
| Fill mode | Nearest neighbor |
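
The table maps directly onto Keras' `ImageDataGenerator`; the `rescale` argument is an assumption here, matching the [0, 1] normalization used at inference:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation applied only to the training split (values from the table)
train_gen = ImageDataGenerator(
    rotation_range=20,          # ±20°
    width_shift_range=0.15,     # ±15%
    height_shift_range=0.15,    # ±15%
    shear_range=0.10,
    zoom_range=0.15,
    horizontal_flip=True,
    vertical_flip=True,
    brightness_range=(0.8, 1.2),
    fill_mode="nearest",
    rescale=1.0 / 255,          # assumption: normalize to [0, 1]
)
```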

Model vs. Alternatives

| Model | Parameters | ImageNet Top-1 | Inference (CPU) | Chosen |
|---|---|---|---|---|
| EfficientNetB0 | 4.2 M | 77.1% | Fast | ✅ |
| EfficientNetB3 | 12 M | 81.6% | Moderate | — |
| ResNet50 | 23 M | 76.0% | Moderate | — |
| VGG16 | 138 M | 71.3% | Slow | — |
| MobileNetV2 | 3.4 M | 71.8% | Fastest | — |

EfficientNetB0 offers the best balance of accuracy and inference speed for the 128Γ—128 PCB patch classification task.


🎯 Defect Classes

The system detects and classifies six defect types defined by the DeepPCB benchmark:

| Class | Color Code | Visual Description | Manufacturing Root Cause |
|---|---|---|---|
| Missing Hole | 🔴 #FF0000 | A required drill hole is absent or partially formed | Drilling machine error, incorrect depth setting |
| Mouse Bite | 🟠 #FF8C00 | A partial chamfer or notch along the PCB edge | PCB routing damage, handling impact |
| Open Circuit | 🟢 #00CC44 | A conductor trace is broken, interrupting the electrical path | Over-etching, mechanical stress fracture |
| Short | 🔵 #3399FF | Two or more unrelated traces are unintentionally connected | Solder bridging, under-etching, contamination |
| Spur | 🟣 #CC44FF | A small copper protrusion extending from a trace | Under-etching, resist defect |
| Spurious Copper | 🩵 #00CCCC | Excess copper deposit outside the intended trace boundary | Resist failure, contamination on board surface |

🌐 Web Application

The Streamlit web application provides a zero-friction interface for PCB inspection β€” no programming knowledge required to operate.

Interface Layout

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  SIDEBAR                  β”‚  MAIN CANVAS                         β”‚
β”‚  ─────────────────────    β”‚  ──────────────────────────────────  β”‚
β”‚  πŸ”¬ PCB Defect AI         β”‚  ╔════════════════════════════════╗  β”‚
β”‚                           β”‚  β•‘  πŸ”¬ PCB Defect Detection Systemβ•‘  β”‚
β”‚  βš™οΈ Configuration         β”‚  β•‘  Upload a PCB image β€” the AI   β•‘  β”‚
β”‚  [Model Path text input]  β”‚  β•‘  pipeline detects defects...   β•‘  β”‚
β”‚                           β”‚  β•šβ•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•β•  β”‚
β”‚  πŸ“‹ Pipeline Steps        β”‚                                      β”‚
β”‚  β‘  Defect Mask (Adaptive) β”‚  β”Œβ”€β”€β”€ Upload PCB Image ───────────┐  β”‚
β”‚  β‘‘ Contour Detection      β”‚  β”‚   Drag & drop or click here    β”‚  β”‚
β”‚  β‘’ ROI Extraction         β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
β”‚  β‘£ Classification (CNN)   β”‚                                      β”‚
β”‚  β‘€ Annotation             β”‚  [πŸš€ Run Defect Detection]           β”‚
β”‚                           β”‚                                      β”‚
β”‚  🏷️ Defect Classes        β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
β”‚  ● Missing Hole           β”‚  β”‚ Defects β”‚ Types β”‚ Conf β”‚ Timeβ”‚    β”‚
β”‚  ● Mouse Bite             β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚
β”‚  ● Open Circuit           β”‚                                      β”‚
β”‚  ● Short                  β”‚  βœ… Inference: 287ms β€” within target β”‚
β”‚  ● Spur                   β”‚                                      β”‚
β”‚  ● Spurious Copper        β”‚  [Original][Mask][Heatmap][Annotated]β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Feature Breakdown

Image Upload

  • Accepts JPG, JPEG, PNG, BMP, TIF formats
  • Displays a pre-analysis preview immediately after upload
  • Processes images of any resolution (internally handled)

Inference Execution

  • Single click triggers the full 5-stage pipeline
  • Spinner with status text during processing
  • Results stored in Streamlit session state β€” persist across re-renders without re-running inference

Metric Dashboard

Four summary tiles appear after every successful inference run:

| Tile | Metric | Description |
|---|---|---|
| Defects Found | Count | Total number of regions detected |
| Defect Types | Count | Number of distinct defect classes |
| Avg Confidence | % | Mean classification probability across detections |
| Inference Time | ms | End-to-end pipeline execution time |

Performance Gate

After every run, the app evaluates whether inference met the ≤ 3,000 ms operational target:

  • βœ… Green banner β€” met the target, displays exact time
  • ⚠️ Warning banner β€” exceeded target, suggests GPU usage or image resizing

4-Panel Visual Output

| Panel | Content |
|---|---|
| Original Input | The uploaded PCB image as-is |
| Defect Mask | Binary (white = defect, black = clean) segmentation output |
| Anomaly Heatmap | 65/35 alpha blend of original + red defect overlay |
| Annotated Output | Original with colored bounding boxes and class + confidence labels |

Prediction Table

A styled HTML table with one row per detection:

  • Color-coded defect class badge (matching bounding box color)
  • Confidence percentage with a visual mini progress bar
  • Centroid pixel coordinates (cx, cy)
  • Bounding box coordinates (x, y, width, height)

Class Distribution Chart

A bar chart (via st.bar_chart) shows the frequency of each defect class across all detections in the uploaded image — useful for quickly identifying dominant defect types on a board.

Export Suite

Four individually downloadable files, all generated fully in memory:

| Button | Filename | Format | Contents |
|---|---|---|---|
| 📥 Download Annotated Image | pcb_annotated_<ts>.png | PNG | Original + bounding boxes |
| 📥 Download CSV Log | pcb_predictions_<ts>.csv | CSV | All detections tabulated |
| 📥 Download Defect Mask | pcb_mask_<ts>.png | PNG | Binary segmentation mask |
| 📥 Download Evaluation Report | pcb_eval_report_<ts>.txt | TXT | Performance gate + defect list |

πŸ“Š Training Data Pipeline

Dataset β€” DeepPCB

The DeepPCB dataset provides annotated PCB image pairs consisting of:

  • A template image (clean, defect-free reference PCB)
  • A test image (the same board location with a manufactured defect introduced)

Both images in each pair are captured under identical conditions, making them suitable for difference-based localization.

Processing Pipeline (Training Data Preparation)

Stage A β€” Image Alignment

The test image is geometrically aligned to its template using ORB (Oriented FAST and Rotated BRIEF) feature matching and RANSAC-robust homography estimation:

  1. cv2.ORB_create(5000) β€” detects 5,000 keypoints in each image
  2. cv2.BFMatcher with NORM_HAMMING + crossCheck=True β€” finds the best mutual descriptor matches
  3. Top 90% of matches by Hamming distance are retained for robustness
  4. cv2.findHomography(RANSAC, threshold=5.0) β€” estimates the perspective transformation
  5. cv2.warpPerspective β€” warps the test image onto the template's canvas

If alignment fails (< 4 keypoints or no homography), the test image is simply resized to the template dimensions as a fallback.

Stage B β€” Defect Mask Generation

  1. Both images are median-blurred (5Γ—5 kernel) to suppress camera noise
  2. cv2.absdiff computes the pixel-wise absolute difference
  3. The difference is converted to grayscale
  4. Otsu's thresholding automatically determines the optimal binary split from the histogram
  5. Morphological close (5Γ—5) fills gaps, open (3Γ—3) removes noise

Stage C β€” ROI Extraction & Dataset Split

  1. Contours are detected on the binary mask with additional morphological erosion to sharpen borders
  2. Each contour passes quality filters (area, aspect ratio, circularity for Missing Hole)
  3. ROI patches are cropped with 4-pixel context padding
  4. All patches are pooled per class, randomly shuffled (seed=42), and split 70 / 15 / 15% into train / validation / test sets
  5. Splits are saved as labeled image folders for direct use with Keras ImageDataGenerator
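
The shuffled 70 / 15 / 15 split in step 4 can be sketched as (illustrative helper; the real script operates on per-class patch files):

```python
import random

def split_patches(paths, seed=42):
    """Shuffle with a fixed seed, then cut 70% / 15% / 15% into
    train / validation / test lists."""
    paths = list(paths)
    random.Random(seed).shuffle(paths)   # deterministic shuffle
    n = len(paths)
    n_train, n_val = int(n * 0.70), int(n * 0.15)
    return (paths[:n_train],
            paths[n_train:n_train + n_val],
            paths[n_train + n_val:])
```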

Dataset Structure After Processing

module2_output/
└── rois/
    β”œβ”€β”€ train/
    β”‚   β”œβ”€β”€ Missing_hole/   [~70% of Missing Hole ROIs]
    β”‚   β”œβ”€β”€ Mouse_bite/     [~70% of Mouse Bite ROIs]
    β”‚   β”œβ”€β”€ Open_circuit/
    β”‚   β”œβ”€β”€ Short/
    β”‚   β”œβ”€β”€ Spur/
    β”‚   └── Spurious_copper/
    β”œβ”€β”€ val/                [~15% each]
    └── test/               [~15% each]

πŸ“ˆ Evaluation & Metrics

Training Evaluation (on ROI test split)

After training completes, the model is evaluated on the held-out test split:

| Metric | Description |
|---|---|
| Test Accuracy | Overall fraction of ROIs correctly classified |
| Precision (weighted) | TP / (TP + FP), averaged by class support |
| Recall (weighted) | TP / (TP + FN), averaged by class support |
| F1-Score (weighted) | Harmonic mean of precision and recall |
| Confusion Matrix | 6×6 heatmap; rows = true label, cols = predicted |
| Classification Report | Per-class and macro/weighted averages |

Target: β‰₯ 95% accuracy on the test split.

All plots and reports are saved automatically to Model Training and Evaluation/module3_output/.

Inference Evaluation (on unseen image pairs)

The Single_Test.py script evaluates the full dual-image inference pipeline on new, unseen PCB image pairs:

| Metric | Computation Method |
|---|---|
| GT Match Rate | Centroid ±32 px proximity + class agreement |
| False Positive Rate | FP / (FP + TN) across the confusion matrix |
| False Negative Rate | FN / (FN + TP) across the confusion matrix |

A ground-truth JSON file can optionally be provided for quantitative comparison. Without it, the script still produces annotated output images and scatter plots of all detections.
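
The GT Match Rate criterion can be sketched as follows (dictionary keys and helper names are illustrative, not the script's actual API):

```python
def matches_ground_truth(det, gt, tol=32):
    """A detection matches a GT entry when the centroids lie within
    ±32 px on both axes and the predicted class agrees."""
    return (abs(det["cx"] - gt["cx"]) <= tol and
            abs(det["cy"] - gt["cy"]) <= tol and
            det["class"] == gt["class"])

def gt_match_rate(detections, ground_truth):
    """Fraction of GT defects matched by at least one detection."""
    hit = sum(any(matches_ground_truth(d, g) for d in detections)
              for g in ground_truth)
    return hit / len(ground_truth) if ground_truth else 0.0
```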


πŸ›  Installation

Prerequisites

| Requirement | Version |
|---|---|
| Windows | 10 or 11 |
| Python | 3.12 (must be on PATH) |
| Git | Any recent version |
| RAM | ≥ 8 GB |
| Disk space | ≥ 5 GB (venv + dataset) |

Setup

# 1. Clone the repository
git clone https://github.com/Cheekurthi-Vamsi/PCB-Defect-Detection-and-Classification-System.git
cd PCB-Defect-Detection-and-Classification-System

# 2. Create a virtual environment
py -3.12 -m venv venv

# 3. Install all dependencies
venv\Scripts\pip install -r requirements.txt

Dependencies

tensorflow-cpu β‰₯ 2.12, < 2.17   # Neural network training and inference
opencv-python β‰₯ 4.8.0            # Image processing
Pillow β‰₯ 10.0.0                  # PIL image I/O and export
streamlit β‰₯ 1.32.0               # Web application framework
numpy β‰₯ 1.24.0                   # Array operations
pandas β‰₯ 2.0.0                   # DataFrame and CSV handling
matplotlib β‰₯ 3.7.0               # Training curve plots
seaborn β‰₯ 0.12.0                 # Confusion matrix heatmaps
scikit-learn β‰₯ 1.3.0             # Classification metrics
imutils β‰₯ 0.5.4                  # Contour grab utility

πŸš€ Usage

1. Download the Dataset

Download the DeepPCB dataset and extract it to:

Image Subtraction\DeepPCB_dataset\

Folder names must exactly match: Missing_hole, Mouse_bite, Open_circuit, Short, Spur, Spurious_copper

2. Train the Model

# One-click: double-click run_pipeline.bat
# Or manually:

venv\Scripts\python "Image Subtraction\image_subraction.py"
venv\Scripts\python "Contour Detection and ROI Extraction\roi_ext.py"
venv\Scripts\python "Model Training and Evaluation\Train.py"

| Stage | Script | Output | Time |
|---|---|---|---|
| Image processing | image_subraction.py | Aligned images + masks | 2–10 min |
| ROI extraction | roi_ext.py | Labeled patch dataset | 1–5 min |
| Model training | Train.py | best_model.keras + plots | 30–90 min |

3. Launch the Web App

# One-click: double-click run_app.bat
# Or manually:
venv\Scripts\streamlit run app.py

Open http://localhost:8501 in your browser.

4. Run Inference

  1. Upload a PCB image using the drag-and-drop zone
  2. Click πŸš€ Run Defect Detection
  3. Review the 4-panel visual output and prediction table
  4. Download your results using any of the four export buttons

πŸ“‹ Results Summary

| Evaluation Metric | Target | Status |
|---|---|---|
| Classification accuracy (test ROI set) | ≥ 95% | Targeted |
| End-to-end inference speed (CPU) | ≤ 3,000 ms | ~200–400 ms ✅ |
| Defect localization | All defect regions | Adaptive + Canny fusion |
| Export functionality | PNG, CSV, Mask, TXT | All 4 implemented ✅ |
| False positive / negative rate | Minimized | Reported per-class |

πŸ“ Project Structure

PCB/
β”‚
β”œβ”€β”€ app.py                                      # Streamlit web application
β”‚
β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ inference.py                            # Core inference pipeline (all stages)
β”‚   └── __init__.py                             # Package initializer
β”‚
β”œβ”€β”€ Image Subtraction/
β”‚   β”œβ”€β”€ image_subraction.py                     # ORB align Β· absdiff Β· Otsu Β· morph
β”‚   β”œβ”€β”€ DeepPCB_dataset/                        # ← Download dataset here
β”‚   └── module1_output/                         # Generated: aligned, masks, diffs
β”‚
β”œβ”€β”€ Contour Detection and ROI Extraction/
β”‚   β”œβ”€β”€ roi_ext.py                              # Contour detect Β· crop Β· split
β”‚   └── module2_output/                         # Generated: ROI patches + visualizations
β”‚
β”œβ”€β”€ Model Training and Evaluation/
β”‚   β”œβ”€β”€ Train.py                                # EfficientNetB0 two-phase training
β”‚   β”œβ”€β”€ Single_Test.py                          # Dual-image evaluation pipeline
β”‚   β”œβ”€β”€ test_images/                            # ← Add unseen test pairs here
β”‚   β”œβ”€β”€ test_ground_truth.json                  # GT annotations template
β”‚   └── module3_output/                         # Generated: models, plots, reports
β”‚
β”œβ”€β”€ docs/
β”‚   β”œβ”€β”€ Technical_Documentation.md              # Full algorithm & architecture reference
β”‚   β”œβ”€β”€ User_Guide.md                           # Installation & usage walkthrough
β”‚   └── Presentation_Content.md                 # 14-slide presentation deck
β”‚
β”œβ”€β”€ requirements.txt                            # Python package dependencies
β”œβ”€β”€ run_app.bat                                 # Launch Streamlit web app
β”œβ”€β”€ run_pipeline.bat                            # Run full training pipeline
└── .gitignore                                  # Ignores venv, dataset, generated outputs

πŸ“š Documentation

Detailed documentation is available in the docs/ directory:

| Document | Contents |
|---|---|
| Technical Documentation | Algorithm details, parameter tables, model architecture, backend function reference, performance characteristics |
| User Guide | System requirements, installation steps, dataset setup, training walkthrough, web app usage, export guide, troubleshooting |
| Presentation Slides | 14-slide deck covering problem statement, solution overview, dataset analysis, model architecture, evaluation results, and future roadmap |

πŸ”­ Future Roadmap

| Enhancement | Priority | Description |
|---|---|---|
| Real-time camera integration | High | Use st.camera_input for live line-of-sight inspection |
| Edge deployment (TFLite) | High | Quantize model for Jetson Nano / Raspberry Pi |
| PDF report export | Medium | Generate formal QA PDFs with reportlab |
| Defect history database | Medium | Persist results to MongoDB for trend analytics |
| Automated alerting | Medium | Email / SMS for critical defect types |
| Batch image processing | Low | Process entire folders without individual uploads |
| Confidence threshold slider | Low | User-adjustable minimum confidence filter in the UI |

πŸ‘€ Author

Vamsi Cheekurthi
PCB Defect Detection & Classification System
2026


Built with Python Β· TensorFlow Β· OpenCV Β· Streamlit
