Skip to content

razamehar/MultiModal-RAG-with-Graph-Understanding

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

5 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MultiModal RAG with Graph Understanding

This project implements a Multimodal Retrieval-Augmented Generation (RAG) system that extracts and processes text, tables, and images (including graphs and charts) from PDF documents. It generates captions for images using OpenAI’s GPT-4o, builds a vector index over all extracted content, and answers user queries by retrieving relevant multimodal information.


Features

  • Extracts text content from PDFs using LangChain’s PyPDFLoader.
  • Extracts tables from PDFs using pdfplumber and converts them into textual documents.
  • Extracts images from PDFs using PyMuPDF (fitz), saves them locally, and generates captions describing charts, plots, or images with OpenAI GPT-4o.
  • Combines text, tables, and image captions into a unified document corpus.
  • Creates a FAISS vector index with OpenAI embeddings for efficient retrieval.
  • Uses a retrieval-augmented generation chain with GPT-4o to answer queries based on multimodal PDF content.
  • Specifically designed to understand and explain graphical data and visual elements within PDFs.

Installation

pip install langchain openai pdfplumber pymupdf pillow faiss-cpu python-dotenv

Environment File

Create a .env file in the project root with your OpenAI API key:

OPENAI_API_KEY=your_openai_api_key_here

Usage

  1. Put your PDF file in the project directory and update the pdf_file variable in the script.
  2. Run the script:
python main.py
  1. The script will:
    • Extract text, tables, and images with captions from the PDF.
    • Index all extracted documents.
    • Answer queries such as "What do the graphs show in this PDF?".
    • Print the generated answer.

Contact

For any questions or clarifications, please contact Raza Mehar at [[email protected]].

Releases

No releases published

Packages

No packages published