A standalone tool to compute diversity scores for collections of chart images using DINOv2 embeddings, with primary focus on the VOL (Volume) metric.
VOL (Volume) is a diversity metric that measures how well a collection of images spans the embedding space. It's computed as the geometric mean of eigenvalues of the Gram matrix (similarity matrix) of L2-normalized embeddings.
- Range: [0, 1]
- Higher VOL = more diverse/spread out images
- VOL = 1 indicates maximum diversity (orthogonal embeddings)
- VOL ≈ 0 indicates low diversity (similar/duplicate images)
VOL is particularly useful for:
- Detecting duplicate or near-duplicate images
- Measuring dataset diversity
- Evaluating synthetic data generation quality
- Comparing different chart collections
- Uses state-of-the-art DINOv2 vision transformer
- Focuses on VOL metric with supporting metrics
- Supports multiple image formats (PNG, JPG, SVG, etc.)
- GPU-accelerated (with CPU fallback)
- Optional saving of embeddings and scores
- Eigenvalue statistics for deeper analysis
- Python 3.9 or higher
- CUDA-capable GPU (optional)
- uv - Fast Python package installer
Option 1: Automated Setup (Recommended)
bash setup.sh
This script will:
- Check Python version
- Install uv if needed
- Install all dependencies
- Verify the installation
Option 2: Manual Setup
- Install uv (if not already installed):
# Via curl (Linux/macOS)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Via Homebrew (macOS)
brew install uv
# Via pip (any platform)
pip install uv
- Install dependencies:
With virtual environment (recommended):
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e .
Or system-wide:
uv pip install --system -e .
Why uv? It's 10-100x faster than pip and provides better dependency resolution.
# Basic usage
python compute_diversity.py /path/to/your/charts

# Save scores to file
python compute_diversity.py /path/to/your/charts --save-scores

# Save scores and embeddings to a custom output folder
python compute_diversity.py /path/to/your/charts \
    --save-scores \
    --save-embeddings \
    --output-dir results

# Force CPU usage
python compute_diversity.py /path/to/your/charts --cpu

# Larger batch size for more GPU memory
python compute_diversity.py /path/to/your/charts --batch-size 32

# Smaller batch size for less GPU memory
python compute_diversity.py /path/to/your/charts --batch-size 8

positional arguments:
  input_dir          Folder with chart images (supports PNG, JPG, SVG, etc.)

optional arguments:
  -h, --help         Show help message and exit
  --output-dir DIR   Output folder for results (default: diversity_out)
  --batch-size N     Batch size for processing (default: 16)
  --cpu              Force CPU usage (disable GPU)
  --save-embeddings  Save embeddings to file
  --save-scores      Save scores to file
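The options above map naturally onto Python's standard `argparse` module. The following is a minimal sketch of a parser that would produce this interface; the function name `build_parser` and the description string are illustrative assumptions, and the actual script may define its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(
        description="Compute diversity scores for a folder of chart images."
    )
    parser.add_argument("input_dir", help="Folder with chart images")
    parser.add_argument("--output-dir", default="diversity_out", metavar="DIR",
                        help="Output folder for results")
    parser.add_argument("--batch-size", type=int, default=16, metavar="N",
                        help="Batch size for processing")
    parser.add_argument("--cpu", action="store_true",
                        help="Force CPU usage (disable GPU)")
    parser.add_argument("--save-embeddings", action="store_true",
                        help="Save embeddings to file")
    parser.add_argument("--save-scores", action="store_true",
                        help="Save scores to file")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["charts", "--batch-size", "8", "--cpu"])
print(args.input_dir, args.output_dir, args.batch_size, args.cpu)
# prints: charts diversity_out 8 True
```

Note that `argparse` converts dashes in option names to underscores, so `--save-scores` is read back as `args.save_scores`.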
The script prints comprehensive results including:
- Primary Metric: VOL score
- Eigenvalue Statistics: Mean, std, min, max
- Supporting Metrics: MPD, RAD, Q10, ENT
Example output:
============================================================
DIVERSITY SCORES (VOL FOCUS)
============================================================
Number of images: 50
Embedding dimension: 768
------------------------------------------------------------
PRIMARY METRIC:
VOL (Volume): 0.234567
Interpretation:
• VOL represents the 'volume' of the convex hull in embedding space
• Higher VOL = more diverse/spread out images
• Range: [0, 1], where 1 = maximum diversity
• Computed as geometric mean of eigenvalues
------------------------------------------------------------
EIGENVALUE STATISTICS:
Mean: 1.234567
Std: 0.123456
Min: 0.012345
Max: 2.345678
------------------------------------------------------------
SUPPORTING METRICS:
MPD (Mean Pairwise Distance): 0.456789
RAD (Minimum Distance): 0.012345
Q10 (10th Percentile): 0.123456
ENT (Normalized Entropy): 0.789012
============================================================
When using --save-scores, a detailed text file is saved with:
- All metrics and scores
- Eigenvalue statistics
- VOL interpretation guide
When using --save-embeddings, a NumPy array file (.npy) is saved containing the DINOv2 embeddings for all images.
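A saved `.npy` file can be loaded back with `np.load` for further analysis. The sketch below writes a stand-in array first so that it runs on its own; the file name `embeddings.npy` is a hypothetical placeholder, since the name the script actually writes is not stated here.

```python
import os
import tempfile

import numpy as np

# Stand-in for a saved file: 50 embeddings of dimension 768.
# "embeddings.npy" is a hypothetical name, not the script's documented output.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "embeddings.npy")
rng = np.random.default_rng(0)
np.save(path, rng.normal(size=(50, 768)).astype(np.float32))

# Load the array back; rows are images, columns are embedding dimensions.
embeddings = np.load(path)
print(embeddings.shape)

# Re-normalize rows to unit length before computing similarities,
# in case the stored vectors are not already L2-normalized.
unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
similarity = unit @ unit.T  # cosine similarity between every pair of images
```

With unit-length rows, `similarity[i, j]` is the cosine similarity between images `i` and `j`, which is convenient for spotting near-duplicates.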
- PNG (.png)
- JPEG (.jpg, .jpeg)
- WebP (.webp)
- BMP (.bmp)
- TIFF (.tif, .tiff)
- SVG (.svg), requires cairosvg
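To check which files in a folder would qualify under the list above, a short filter by extension is enough. This is a sketch under two assumptions that may not match the tool: the search is non-recursive, and extension matching is case-insensitive. The helper name `find_images` is illustrative.

```python
from pathlib import Path

# Extensions taken from the supported-format list above.
SUPPORTED_EXTENSIONS = {
    ".png", ".jpg", ".jpeg", ".webp", ".bmp", ".tif", ".tiff", ".svg",
}


def find_images(folder):
    """Return sorted paths of files in `folder` with a supported extension.

    Assumptions: only the top level of the folder is scanned, and the
    extension comparison ignores case.
    """
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_images("/path/to/your/charts")` before a long run is a quick way to confirm that the folder is readable and not empty.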
- VOL (Volume): Geometric mean of eigenvalues of the Gram matrix. Measures the "volume" spanned by embeddings in high-dimensional space.
- MPD (Mean Pairwise Distance): Average distance between all pairs of images
- RAD (Radius): Minimum pairwise distance (detects duplicates)
- Q10 (10th Percentile): 10th percentile of pairwise distances (robust to outliers)
- ENT (Normalized Entropy): Entropy of eigenvalue distribution (normalized by log(n))
- Mean: Average eigenvalue (indicates overall embedding spread)
- Std: Standard deviation (indicates variability in dimensions)
- Min: Smallest eigenvalue (detects collapsed dimensions)
- Max: Largest eigenvalue (detects dominant directions)
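The supporting metrics and eigenvalue statistics above can be sketched in a few lines of NumPy. Two details are assumptions rather than documented behavior: distances are taken as Euclidean distances between L2-normalized embeddings, and the entropy is computed over eigenvalues rescaled to sum to 1. The function name `supporting_metrics` is illustrative, and at least two images are required.

```python
import numpy as np


def supporting_metrics(embeddings):
    """Compute MPD, RAD, Q10, ENT and eigenvalue statistics (sketch)."""
    E = np.asarray(embeddings, dtype=np.float64)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)  # L2-normalize rows
    n = E.shape[0]

    S = E @ E.T  # Gram (cosine similarity) matrix
    # Assumption: Euclidean distance between unit vectors, d^2 = 2 - 2*cos.
    D = np.sqrt(np.clip(2.0 - 2.0 * S, 0.0, None))
    pair = D[np.triu_indices(n, k=1)]  # each unordered pair exactly once

    # S is symmetric positive semi-definite; clip tiny negative round-off.
    eig = np.clip(np.linalg.eigvalsh(S), 0.0, None)
    p = eig / eig.sum()
    p = p[p > 0]  # treat 0 * log(0) as 0
    entropy = -(p * np.log(p)).sum() / np.log(n)

    return {
        "MPD": float(pair.mean()),
        "RAD": float(pair.min()),
        "Q10": float(np.percentile(pair, 10)),
        "ENT": float(entropy),
        "eig_mean": float(eig.mean()),
        "eig_std": float(eig.std()),
        "eig_min": float(eig.min()),
        "eig_max": float(eig.max()),
    }
```

Under these assumptions, a RAD of 0 means at least two images have identical embeddings, and an ENT near 1 means the eigenvalues are close to uniform.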
This tool uses DINOv2 ViT-Base-Patch14 (vit_base_patch14_dinov2), a self-supervised vision transformer trained on diverse image data. It produces 768-dimensional embeddings that capture semantic visual features.
- State-of-the-art for visual similarity
- No task-specific fine-tuning needed
- Robust to image variations
- Excellent for chart/diagram understanding
1. Compute similarity matrix: S = E @ E^T (where E are normalized embeddings)
2. Compute eigenvalues: λ₁, λ₂, ..., λₙ = eig(S)
3. VOL = (∏ λᵢ)^(1/n) = geometric mean of eigenvalues
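The three steps above can be sketched directly in NumPy. The geometric mean is computed in log space to avoid overflow or underflow in the product. The small floor `eps` on the eigenvalues is an assumption added to keep the logarithm finite; whether the tool applies any such regularization is not stated here. The function name `vol_score` is illustrative.

```python
import numpy as np


def vol_score(embeddings, eps=1e-12):
    """Geometric mean of the Gram-matrix eigenvalues (sketch)."""
    E = np.asarray(embeddings, dtype=np.float64)
    E = E / np.linalg.norm(E, axis=1, keepdims=True)  # L2-normalize rows

    S = E @ E.T                      # step 1: similarity (Gram) matrix
    eigvals = np.linalg.eigvalsh(S)  # step 2: eigenvalues of symmetric S

    # Assumption: floor eigenvalues at eps so log() stays finite.
    eigvals = np.clip(eigvals, eps, None)

    # Step 3: geometric mean, computed as exp(mean(log(eigenvalues))).
    return float(np.exp(np.mean(np.log(eigvals))))


print(vol_score(np.eye(5)))                   # orthogonal embeddings
print(vol_score([[1.0, 0.0], [1.0, 0.0]]))    # two identical embeddings
```

Orthogonal embeddings give a value of 1, and duplicates push the value toward 0. Because the Gram matrix has rank at most the embedding dimension, a collection with more images than dimensions (more than 768 here) has zero eigenvalues, so an unregularized score collapses to 0 regardless of how varied the images are.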
- Use GPU: Ensure PyTorch is installed with CUDA support
- Increase batch size: Use --batch-size 32 or higher if GPU memory allows
- Monitor memory: Reduce batch size if you encounter OOM errors
- Use CPU: Add the --cpu flag if GPU overhead is not worth it (< 100 images)
- Smaller batch size: Use --batch-size 8 to reduce memory usage
- 100 images: ~30 seconds (GPU) / ~2 minutes (CPU)
- 1000 images: ~5 minutes (GPU) / ~20 minutes (CPU)
- 10000 images: ~45 minutes (GPU) / ~3 hours (CPU)
Times are approximate and depend on hardware
If you run out of GPU memory, reduce the batch size:
python compute_diversity.py /path/to/charts --batch-size 4
If no images are found, ensure your folder contains supported image formats and check file permissions.
If SVG files fail to load:
# On macOS
brew install cairo
# On Ubuntu/Debian
sudo apt-get install libcairo2-dev
# Then reinstall cairosvg
uv pip install --upgrade cairosvg
This tool is provided as-is for research and evaluation purposes.