🏆 Amazon ML Challenge 2025: Smart Product Pricing

Multi-Modal Deep Learning Pipeline for Intelligent Price Prediction

🎯 Problem Statement

Predict product prices using multi-modal data:

📝 Product descriptions (text)
🖼️ Product images
💰 Historical pricing

🏗️ Solution Architecture

High-Level Pipeline Overview

    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃              INPUT LAYER (Raw Data)                   ┃
    ┣━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┫
    ┃  📝 Product Text  │  🖼️ Product Images  │  💰 Prices   ┃
    ┗━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┛
                 ║                         ║
                 ▼                         ▼
    ┏━━━━━━━━━━━━━━━━━━━━━━┓  ┏━━━━━━━━━━━━━━━━━━━━━━┓
    ┃   STAGE 1: Extract   ┃  ┃  STAGE 2: Generate   ┃
    ┃   Base Features      ┃  ┃  Multi-Modal Data    ┃
    ┃                      ┃  ┃                      ┃
    ┃  • Text Parsing      ┃  ┃  • Text Embeddings   ┃
    ┃  • Unit Convert      ┃  ┃  • Image Embeddings  ┃
    ┃  • Brand Encoding    ┃  ┃  • KNN Features      ┃
    ┃  • TF-IDF Features   ┃  ┃                      ┃
    ┃                      ┃  ┃  Output: 3,212 feat. ┃
    ┃  Output: 520 feat.   ┃  ┃                      ┃
    ┗━━━━━━━━┬━━━━━━━━━━━━┛  ┗━━━━━━━━┬━━━━━━━━━━━━┛
             │                        │
             └────────────┬───────────┘
                          ▼
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃         STAGE 3: ENSEMBLE MODELING                    ┃
    ┃                                                       ┃
    ┃   📊 Consolidated Features: ~3,732                   ┃
    ┃                                                       ┃
    ┃   ┌──────────────────────┐  ┌──────────────────────┐ ┃
    ┃   │   LightGBM Model     │  │  CatBoost Model      │ ┃
    ┃   │   • 1000 trees       │  │  • 1000 trees        │ ┃
    ┃   │   • Depth: 7         │  │  • Depth: 7          │ ┃
    ┃   │   • LR: 0.05         │  │  • LR: 0.05          │ ┃
    ┃   │   • SMAPE: 54.78%    │  │  • SMAPE: 54.52%     │ ┃
    ┃   └──────────┬───────────┘  └──────────┬───────────┘ ┃
    ┃              │                         │              ┃
    ┃              └─────────────┬───────────┘              ┃
    ┃                            ▼                          ┃
    ┃               Final = (LightGBM + CatBoost) / 2      ┃
    ┃                                                       ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
                              ▼
    ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
    ┃         🎯 OUTPUT: Price Predictions                 ┃
    ┃         ✅ SMAPE: 53.99%                              ┃
    ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

STAGE 1: Feature Extraction (520 features)

INPUT: Product Text Descriptions
    ▼
┌─────────────────────────────────────────────────────────┐
│              TEXT PARSING & CLEANING                    │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐  │
│  │  Pack Qty    │  │   Weight     │  │   Volume     │  │
│  │  Extraction  │  │  Extraction  │  │  Extraction  │  │
│  └──────────────┘  └──────────────┘  └──────────────┘  │
└────────────────┬────────────────────────────────────────┘
                 ▼
┌─────────────────────────────────────────────────────────┐
│           UNIT STANDARDIZATION                          │
│  Weight → grams  │  Volume → milliliters  │ Values OK  │
└────────────────┬────────────────────────────────────────┘
                 ▼
┌─────────────────────────────────────────────────────────┐
│              FEATURE ENGINEERING                        │
│  ┌──────────────────────────────────────────────────┐  │
│  │  • Brand Target Encoding (smoothed)              │  │
│  │  • TF-IDF Features from description (500 feat.)  │  │
│  │  • Interaction Features                          │  │
│  │    - pack_weight_ratio                           │  │
│  │    - brand_avg_price                             │  │
│  │  • Price Normalization & Smoothing               │  │
│  └──────────────────────────────────────────────────┘  │
└────────────────┬────────────────────────────────────────┘
                 ▼
         ✅ OUTPUT: 520 numerical features
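
The parsing and unit-standardization steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the regular expressions, the unit tables, the default pack quantity of 1, and the function names (`parse_quantity`, `parse_pack_qty`, `extract_base_features`) are all assumptions made for the example.

```python
import re

# Conversion factors to the base units named in the diagram (grams, milliliters).
# The unit tables are illustrative assumptions, not the repository's actual lists.
WEIGHT_TO_GRAMS = {"mg": 0.001, "g": 1.0, "kg": 1000.0, "oz": 28.3495, "lb": 453.592}
VOLUME_TO_ML = {"ml": 1.0, "l": 1000.0, "fl oz": 29.5735}


def parse_quantity(text, unit_table):
    """Return the first '<number> <unit>' found, converted to the base unit, or None."""
    # Try longer unit names first so "fl oz" is attempted before shorter units.
    units = "|".join(re.escape(u) for u in sorted(unit_table, key=len, reverse=True))
    match = re.search(r"(\d+(?:\.\d+)?)\s*(" + units + r")\b", text.lower())
    if match is None:
        return None
    return float(match.group(1)) * unit_table[match.group(2)]


def parse_pack_qty(text):
    """Return the pack quantity, assuming 1 when the text does not mention one."""
    match = re.search(r"pack of (\d+)|(\d+)\s*[- ]?(?:pack|count|ct)\b", text.lower())
    if match is None:
        return 1
    return int(match.group(1) or match.group(2))


def extract_base_features(text):
    """Collect the three parsed quantities into one feature dictionary."""
    return {
        "pack_quantity": parse_pack_qty(text),
        "weight_grams": parse_quantity(text, WEIGHT_TO_GRAMS),
        "volume_ml": parse_quantity(text, VOLUME_TO_ML),
    }


tea = extract_base_features("Organic Green Tea, Pack of 6, 500 g")
water = extract_base_features("Sparkling Water 12 fl oz (24 count)")
```

Missing quantities are returned as `None` here so that the downstream model, or an imputation step, can decide how to treat them.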

STAGE 2: Multi-Modal Embeddings (3,212 features)

Part A: Parallel Embedding Generation

┌────────────────────────────────────┐    ┌────────────────────────────────────┐
│    TEXT EMBEDDINGS                 │    │    IMAGE EMBEDDINGS                │
│    (5 Transformer Models)          │    │    (EfficientNet-B0 CNN)           │
├────────────────────────────────────┤    ├────────────────────────────────────┤
│                                    │    │                                    │
│  1️⃣  MiniLM-L6-v2                  │    │  🖼️  EfficientNet-B0               │
│      └─ 384-dim vectors            │    │      └─ Pre-trained on ImageNet    │
│                                    │    │      └─ Global Average Pooling    │
│  2️⃣  Multilingual MiniLM           │    │      └─ 1280-dimensional output   │
│      └─ 384-dim vectors            │    │                                    │
│                                    │    │                                    │
│  3️⃣  all-MiniLM-L12                │    │                                    │
│      └─ 384-dim vectors            │    │                                    │
│                                    │    │                                    │
│  4️⃣  distiluse-base                │    │                                    │
│      └─ 384-dim vectors            │    │                                    │
│                                    │    │                                    │
│  5️⃣  all-MiniLM-L6                 │    │                                    │
│      └─ 384-dim vectors            │    │                                    │
│                                    │    │                                    │
│  Total: 5 × 384 = 1920 features   │    │  Total: 1280 features             │
└────────────────┬───────────────────┘    └────────────────┬───────────────────┘
                 │                                        │
                 └────────────────┬─────────────────────────┘
                                  ▼
                ┌──────────────────────────────────────┐
                │ CONCATENATED EMBEDDING SPACE         │
                │ 1920 + 1280 = 3200 dimensions       │
                └──────────────┬───────────────────────┘

Part B: KNN Similarity Features

Using 3,200-dimensional Concatenated Embeddings
    ▼
┌──────────────────────────────────────────────────────────────┐
│              K-NEAREST NEIGHBORS ANALYSIS (K=10)             │
├──────────────────────────────────────────────────────────────┤
│                                                              │
│  📊 TEXT SPACE KNN                                           │
│  ├─ Mean Price of 10 nearest neighbors                       │
│  ├─ Std Dev of neighbor prices                              │
│  ├─ Min price among neighbors                               │
│  └─ Max price among neighbors                               │
│                                                              │
│  🖼️ IMAGE SPACE KNN                                          │
│  ├─ Mean Price of 10 nearest neighbors                       │
│  ├─ Std Dev of neighbor prices                              │
│  ├─ Min price among neighbors                               │
│  └─ Max price among neighbors                               │
│                                                              │
│  🎯 COMBINED SPACE KNN                                       │
│  ├─ Mean Price of 10 nearest neighbors                       │
│  ├─ Std Dev of neighbor prices                              │
│  ├─ Min price among neighbors                               │
│  └─ Max price among neighbors                               │
│                                                              │
│  Total KNN Features: 4 metrics × 3 spaces = 12 features    │
└──────────────────┬───────────────────────────────────────────┘
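
The four neighbor statistics per embedding space can be sketched as below. This is a brute-force, standard-library illustration under stated assumptions: Euclidean distance, population standard deviation, and excluding a training row from its own neighbor list (so its price does not leak into its own feature) are choices made for the example, and the function name `knn_price_features` is hypothetical. The Tech Stack lists FAISS for the actual similarity search.

```python
import math
import statistics


def knn_price_features(query, train_vectors, train_prices, k=10, exclude_index=None):
    """Return mean/std/min/max of the prices of the k nearest training rows.

    `exclude_index` skips one training row, so a training item is never
    counted as its own neighbor (an assumption made for this sketch).
    """
    distances = []
    for i, vec in enumerate(train_vectors):
        if i == exclude_index:
            continue
        distances.append((math.dist(query, vec), i))
    distances.sort()
    neighbor_prices = [train_prices[i] for _, i in distances[:k]]
    return {
        "knn_mean_price": statistics.fmean(neighbor_prices),
        "knn_std_price": statistics.pstdev(neighbor_prices),
        "knn_min_price": min(neighbor_prices),
        "knn_max_price": max(neighbor_prices),
    }


# Toy data: four products with tiny stand-in "embeddings" and known prices.
text_vecs = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
image_vecs = [[1.0], [1.1], [9.0], [9.2]]
prices = [10.0, 12.0, 100.0, 110.0]
combined_vecs = [t + i for t, i in zip(text_vecs, image_vecs)]  # concatenation

spaces = {"text": text_vecs, "image": image_vecs, "combined": combined_vecs}
features = {}
for space, vectors in spaces.items():
    stats = knn_price_features(vectors[0], vectors, prices, k=2, exclude_index=0)
    for name, value in stats.items():
        features[f"{name}_{space}"] = value  # e.g. knn_mean_price_combined
```

With 4 statistics in each of 3 spaces, `features` holds the 12 KNN features counted in the diagram.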

Part C: Stage 2 Output Summary

┌─────────────────────────────────────────────────┐
│   STAGE 2 CONSOLIDATED OUTPUT                   │
├─────────────────────────────────────────────────┤
│  • Text Embeddings (5 models):    1920 feat.   │
│  • Image Embeddings (1 model):    1280 feat.   │
│  • KNN Features (3 spaces):          12 feat.   │
│  ─────────────────────────────────────────────  │
│  TOTAL:                            3212 feat.  │
└─────────────────────────────────────────────────┘
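
As a quick consistency check, the feature counts quoted in this README can be added up in a few lines. The dimensions are copied from the diagrams above; the dictionary and variable names are illustrative only.

```python
# Embedding sizes as stated in Part A (model labels copied from the diagram).
TEXT_EMBEDDING_DIMS = {
    "MiniLM-L6-v2": 384,
    "Multilingual MiniLM": 384,
    "all-MiniLM-L12": 384,
    "distiluse-base": 384,
    "all-MiniLM-L6": 384,
}
IMAGE_EMBEDDING_DIM = 1280      # EfficientNet-B0 after global average pooling
KNN_STATS_PER_SPACE = 4         # mean, std, min, max
KNN_SPACES = 3                  # text, image, combined
STAGE1_FEATURES = 520           # Stage 1 output

text_total = sum(TEXT_EMBEDDING_DIMS.values())
knn_total = KNN_STATS_PER_SPACE * KNN_SPACES
stage2_total = text_total + IMAGE_EMBEDDING_DIM + knn_total
grand_total = STAGE1_FEATURES + stage2_total

print(text_total, knn_total, stage2_total, grand_total)
```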

STAGE 3: Ensemble Modeling (~3,732 features)

┌─────────────────────────────────────────────────────────────┐
│        CONSOLIDATED FEATURE MATRIX                          │
├─────────────────────────────────────────────────────────────┤
│  From STAGE 1:  Numerical Features        ~520 feat.       │
│  From STAGE 2:  Text Embeddings           1920 feat.       │
│  From STAGE 2:  Image Embeddings          1280 feat.       │
│  From STAGE 2:  KNN Features                 12 feat.       │
│  ─────────────────────────────────────────────────────────  │
│  TOTAL:                                   ~3,732 feat.     │
└────────────────┬────────────────────────────────────────────┘
                 ▼
     ┌──────────────────────────────────────┐
     │   SPLIT: Train & Validation Data     │
     └──────────────────┬───────────────────┘
                        ▼
        ┌────────────────────────────────────┐
        │     GRADIENT BOOSTING MODELS       │
        ├────────────────────────────────────┤
        │                                    │
        │  🟦 LightGBM                       │
        │  ├─ Trees: 1000                   │
        │  ├─ Max Depth: 7                  │
        │  ├─ Learning Rate: 0.05           │
        │  ├─ Num Leaves: 31                │
        │  └─ SMAPE: 54.78%                 │
        │                                    │
        │  🟪 CatBoost                       │
        │  ├─ Trees: 1000                   │
        │  ├─ Max Depth: 7                  │
        │  ├─ Learning Rate: 0.05           │
        │  ├─ Handle Cat Features: Yes      │
        │  └─ SMAPE: 54.52%                 │
        │                                    │
        └────────────────┬───────────────────┘
                         ▼
        ┌────────────────────────────────────┐
        │   ENSEMBLE AVERAGING               │
        │   Final = (LightGBM + CatBoost)/2  │
        └────────────────┬───────────────────┘
                         ▼
        ✅ FINAL PREDICTIONS
           SMAPE: 53.99% ⭐
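
The averaging step, Final = (LightGBM + CatBoost)/2, amounts to an element-wise mean of the two prediction vectors. A minimal sketch, with hypothetical prediction values and a hypothetical helper name `average_ensemble`:

```python
def average_ensemble(*model_predictions):
    """Element-wise mean of equally long prediction lists, one list per model."""
    lengths = {len(p) for p in model_predictions}
    if len(lengths) != 1:
        raise ValueError("all models must predict the same number of rows")
    n_models = len(model_predictions)
    return [sum(row) / n_models for row in zip(*model_predictions)]


# Hypothetical per-product price predictions from the two models.
lightgbm_pred = [10.0, 22.0, 48.0]
catboost_pred = [14.0, 18.0, 52.0]

final_pred = average_ensemble(lightgbm_pred, catboost_pred)
```

When the two models err in opposite directions on a product, as in these toy values, the averaged prediction sits between them, which is one reason a simple average can score better than either model alone.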

📊 Complete Data Flow Diagram

                    ┌──────────────────┐
                    │   INPUT DATA     │
                    └────────┬─────────┘
                             │
                    ┌────────┴─────────┐
                    ▼                  ▼
            ┌───────────────┐  ┌──────────────┐
            │  TEXT DATA    │  │  IMAGE DATA  │
            └───────┬───────┘  └──────┬───────┘
                    │                 │
                    └────────┬────────┘
                             ▼
            ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
            ┃  STAGE 1: EXTRACT FEATURES  ┃
            ┃  Output: 520 features       ┃
            ┗━━━━━━━━━━┬━━━━━━━━━━━━━━━━━┛
                       ▼
            ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
            ┃ STAGE 2: EMBEDDINGS + KNN   ┃
            ┃ Output: 3,212 features      ┃
            ┗━━━━━━━━━━┬━━━━━━━━━━━━━━━━━┛
                       ▼
            ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
            ┃ STAGE 3: ENSEMBLE TRAINING  ┃
            ┃ 3,732 consolidated features ┃
            ┃ LightGBM + CatBoost average │
            ┗━━━━━━━━━━┬━━━━━━━━━━━━━━━━━┛
                       ▼
            ┌──────────────────────────┐
            │  🎯 PRICE PREDICTIONS    │
            │  ✅ SMAPE: 53.99%        │
            └──────────────────────────┘

📈 Feature Importance Distribution

┌─────────────────────────────────────────────────┐
│        TOP 10 FEATURE CONTRIBUTIONS             │
├─────────────────────────────────────────────────┤
│                                                 │
│  1. knn_mean_price_combined  ████████████ 18.2%│
│  2. brand_target_encoded     ████████ 12.4%    │
│  3. knn_mean_price_text      ██████ 9.8%       │
│  4. weight_grams             █████ 7.3%        │
│  5. knn_std_price_combined   ████ 6.5%         │
│  6. image_embedding_0        ███ 5.1%          │
│  7. pack_quantity            ███ 4.9%          │
│  8. knn_mean_price_image     ██ 4.2%           │
│  9. volume_ml                ██ 3.8%           │
│  10. text_embedding_1_0      ██ 3.1%           │
│                                                 │
└─────────────────────────────────────────────────┘
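
brand_target_encoded ranks second in the list above. Stage 1 describes it as smoothed brand target encoding but does not give the formula, so the following is only one common variant: each brand's mean price is blended with the global mean, weighted by how many products the brand has. The `smoothing` value and the function name are assumptions for illustration.

```python
from collections import defaultdict


def smoothed_target_encoding(brands, prices, smoothing=10.0):
    """Map each brand to a blend of its mean price and the global mean price.

    encoded = (n * brand_mean + smoothing * global_mean) / (n + smoothing)
    where n is the number of training rows for the brand.
    """
    global_mean = sum(prices) / len(prices)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for brand, price in zip(brands, prices):
        totals[brand] += price      # n * brand_mean, accumulated as a sum
        counts[brand] += 1
    encoding = {
        brand: (totals[brand] + smoothing * global_mean) / (counts[brand] + smoothing)
        for brand in counts
    }
    return encoding, global_mean


train_brands = ["acme", "acme", "acme", "zeta"]
train_prices = [10.0, 12.0, 14.0, 100.0]
encoding, global_mean = smoothed_target_encoding(train_brands, train_prices, smoothing=2.0)

# Brands never seen in training fall back to the global mean.
unseen_value = encoding.get("unknown-brand", global_mean)
```

To avoid target leakage, an encoding like this would normally be fitted on training folds only and then applied to validation and test rows.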

📊 Model Performance

╔═══════════════════════════════════════════════════════════╗
║              PERFORMANCE METRICS                          ║
╠═══════════════════════════════════════════════════════════╣
║                                                           ║
║  📊 LightGBM (Single Model)        →  54.78% SMAPE       ║
║  📊 CatBoost (Single Model)        →  54.52% SMAPE       ║
║                                                           ║
║  ✅ ENSEMBLE (Average)             →  53.99% SMAPE ⭐    ║
║                                                           ║
║  🚀 Ensemble vs. best single:  ↓ 0.53 pts SMAPE         ║
║                                                           ║
╚═══════════════════════════════════════════════════════════╝
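
For reference, SMAPE in its common form can be computed as below. Whether the challenge scorer uses exactly this definition (mean of |predicted - actual| divided by the average of their absolute values, in percent) is an assumption; the helper name `smape` and the sample values are illustrative.

```python
def smape(actual, predicted):
    """Symmetric mean absolute percentage error, in percent (range 0 to 200)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for a, p in zip(actual, predicted):
        denom = (abs(a) + abs(p)) / 2.0
        # A pair of zeros contributes no error instead of dividing by zero.
        total += 0.0 if denom == 0 else abs(p - a) / denom
    return 100.0 * total / len(actual)


# Hypothetical prices: errors of 10 on items priced 100 and 50.
score = smape([100.0, 50.0], [110.0, 40.0])

# Gap between the best single model (CatBoost) and the ensemble, in points.
ensemble_gain = round(54.52 - 53.99, 2)
```

Because the denominator uses both the actual and the predicted price, the same absolute error costs more on cheap items than on expensive ones.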

🚀 Quick Start

# Stage 1: Extract features
python enhanced_feature_extraction.py

# Stage 2: Generate embeddings (GPU recommended)
python generate_all_embeddings.py

# Stage 3: Train and predict
python validate_and_submit_ensemble.py

📦 Tech Stack

┌─────────────────────────────────────────────┐
│  DEEP LEARNING                              │
│  • PyTorch                                  │
│  • Sentence Transformers                    │
│  • EfficientNet (timm)                      │
├─────────────────────────────────────────────┤
│  GRADIENT BOOSTING                          │
│  • LightGBM                                 │
│  • CatBoost                                 │
├─────────────────────────────────────────────┤
│  SIMILARITY SEARCH                          │
│  • FAISS (Facebook AI)                      │
├─────────────────────────────────────────────┤
│  DATA PROCESSING                            │
│  • Pandas, NumPy, Scikit-learn              │
└─────────────────────────────────────────────┘

💡 Key Innovation

Our solution's strength lies in three-level similarity features:

🔤 Text-based neighbors → Semantically similar products
🖼️ Image-based neighbors → Visually similar products
🎯 Combined neighbors → Holistically similar products

Each space captures a different pricing signal, and the KNN features built on them account for four of the top 10 features by importance.

🎓 Results Summary

✅ Validation SMAPE: 53.99%
✅ 3,732 engineered features
✅ 5 text embedding models
✅ Multi-space KNN features
✅ LightGBM + CatBoost averaging ensemble

⭐ Star this repo if you found it helpful! ⭐

Made with ❤️ for Amazon ML Challenge 2025

Prahlad-07/Amazon-ML-Team-int_64t