Research Project
Multimodal Entity Linking with Vision and Text on WikiData
Supervisor: Dr. Mehwish Alam
December 2024 – Ongoing
This repository contains code and data for MMEL, a research project exploring how to link visual and textual content to WikiData entities (QIDs). We:
- Fine-tune two vision-language models (CLIP and LLAVA NEXT) on paired image-text data (a minimal fine-tuning sketch follows this list).
- Map the learned embeddings to the WikiData Knowledge Graph by computing similarities against candidate QIDs (see the linking sketch after the directory tree).
- Extend the pipeline to multi-entity images and videos.
- Analyze “dark spots” in VLM predictions, i.e. masking behaviors on sensitive or under-represented data.
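
The fine-tuning step trains the vision-language encoders contrastively on the paired data in `Data/`. The snippet below is a minimal sketch of that step using the Hugging Face `transformers` CLIP classes; the checkpoint name, batch interface, and learning rate are illustrative assumptions, and the project's actual training code lives in `fine_tune_models/train_clip.py`.

```python
# Minimal sketch of contrastive fine-tuning on paired image-text data.
# Model checkpoint and hyperparameters are illustrative assumptions.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-6)

def training_step(image_paths, captions):
    """One contrastive update on a batch of (image, caption) pairs."""
    images = [Image.open(p).convert("RGB") for p in image_paths]
    inputs = processor(text=captions, images=images,
                       return_tensors="pt", padding=True, truncation=True)
    # return_loss=True makes CLIPModel compute the symmetric contrastive loss
    # over the batch's image-text similarity matrix.
    outputs = model(**inputs, return_loss=True)
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return outputs.loss.item()
```
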
```
MMEL/
├── Data/                     # Raw & preprocessed datasets
│   ├── images/               # Image assets
│   └── text/                 # Text captions / labels
│
├── fine_tune_models/         # Scripts & configs to fine-tune CLIP, LLAVA NEXT
│   ├── train_clip.py
│   └── train_llava.py
│
├── lava_predictions/         # Stored inference outputs from LLAVA NEXT
│
├── test_and_vis/             # Evaluation scripts & visualization notebooks
│
├── .gitignore
├── dataset_analysis.py       # Exploratory data analysis
├── embedding_similarity.py   # QID similarity mapping
├── llava_vlm.py              # LLAVA NEXT wrapper & helper functions
├── model.py                  # Unified model definitions
├── read_predictions.py       # Loader for prediction files
├── test.py                   # End-to-end pipeline runner
├── utils.py                  # Common utilities
├── vlm_models.py             # CLIP / other VLM wrappers
└── wikidata_linking.py       # Core linking logic to WikiData KG
```
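
For the linking step, an image (or text) embedding is compared against WikiData entities by similarity. The sketch below shows one plausible version of that mapping, assuming a pre-computed tensor of QID label embeddings (`qid_label_embeddings.pt` is a hypothetical file name) and cosine-similarity ranking; it is not the exact interface of `embedding_similarity.py` or `wikidata_linking.py`.

```python
# Sketch of QID linking by embedding similarity. The index file name and its
# layout ({"qids": [...], "embeddings": FloatTensor [N, D]}) are assumptions.
import torch
import torch.nn.functional as F
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

index = torch.load("qid_label_embeddings.pt")          # hypothetical index file
qid_matrix = F.normalize(index["embeddings"], dim=-1)  # one row per QID label

@torch.no_grad()
def link_image(image_path, top_k=5):
    """Return the top-k candidate QIDs (with cosine scores) for one image."""
    image = Image.open(image_path).convert("RGB")
    pixel_values = processor(images=image, return_tensors="pt").pixel_values
    image_emb = F.normalize(model.get_image_features(pixel_values=pixel_values), dim=-1)
    scores = (image_emb @ qid_matrix.T).squeeze(0)      # cosine similarities
    values, indices = scores.topk(top_k)
    return [(index["qids"][i], v.item()) for i, v in zip(indices.tolist(), values)]
```

In this setup the candidate QID embeddings would be produced once (e.g. by running the text encoder over WikiData labels/descriptions) and cached, so linking at inference time reduces to a single matrix multiplication.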