An intelligent medical assistant chatbot powered by AI that provides accurate medical information using Retrieval-Augmented Generation (RAG) with LangChain, Pinecone, and Groq.
- Overview
- Features
- Tech Stack
- Architecture
- Installation
- Configuration
- Usage
- Project Structure
- API Endpoints
This Medical Chatbot uses state-of-the-art natural language processing to answer medical questions by retrieving relevant information from a curated knowledge base of medical documents. The system combines semantic search with large language models to provide accurate, context-aware responses.
- AI-Powered Responses: Uses Groq's Llama 3.3 70B model for intelligent answers
- Semantic Search: Leverages Pinecone vector database for fast, accurate document retrieval
- PDF Knowledge Base: Processes medical PDFs to build a comprehensive knowledge base
- Interactive Web Interface: User-friendly chat interface built with Flask
- Context-Aware: Retrieves the top 3 most relevant documents for each query
- Fast Response Time: Optimized for quick inference with Groq API
- Python 3.10+
- Flask: Web framework
- LangChain: LLM orchestration framework
- LangChain Groq: Groq integration for LLM inference
- LangChain Pinecone: Vector store integration
- LangChain HuggingFace: Embeddings model
- Groq API: Fast LLM inference (Llama 3.3 70B Versatile)
- HuggingFace: Embeddings (sentence-transformers/all-MiniLM-L6-v2)
- Pinecone: Vector database for semantic search
- PyPDF: PDF parsing
- RecursiveCharacterTextSplitter: Text chunking
User Query
    ↓
Flask Web App
    ↓
LangChain RAG Pipeline
    ↓
    ├── HuggingFace Embeddings (384-dim vectors)
    ↓
Pinecone Vector Store (Similarity Search)
    ↓
Top 3 Relevant Documents
    ↓
Groq LLM (Llama 3.3 70B)
    ↓
Generated Response
    ↓
User Interface
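
The retrieval and prompt-assembly stages of this flow can be sketched in plain Python. This is a toy illustration under stated assumptions, not the project's code: the 3-dimensional vectors stand in for the 384-dimensional embeddings, cosine similarity is an assumed metric (the actual metric depends on how the Pinecone index is configured), and `cosine_similarity`, `top_k`, and `build_prompt` are hypothetical helper names.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed_chunks, k=3):
    """Return the texts of the k chunks whose vectors are most similar to the query."""
    ranked = sorted(
        indexed_chunks,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, contexts):
    """Assemble the retrieved chunks and the user question into one prompt string."""
    context_block = "\n\n".join(contexts)
    return f"Context:\n{context_block}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Toy index: (chunk text, made-up embedding). Real embeddings would be 384-dim.
    index = [
        ("Diabetes is a chronic condition.", [1.0, 0.0, 0.0]),
        ("Insulin regulates blood sugar.", [0.9, 0.1, 0.0]),
        ("Acne is a skin condition.", [0.0, 1.0, 0.0]),
        ("Blood sugar can be measured at home.", [0.5, 0.5, 0.0]),
    ]
    query_vector = [1.0, 0.0, 0.0]  # pretend embedding of "What is diabetes?"
    print(build_prompt("What is diabetes?", top_k(query_vector, index)))
```

In the real pipeline the vector store performs the similarity search and the assembled prompt is sent to the LLM; the sketch only shows how the three most similar chunks end up in the prompt.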
- Python 3.10 or higher
- Anaconda/Miniconda (recommended)
- Pinecone account
- Groq API account
git clone https://github.com/nadamankai/Medical-Chatbot.git
cd Medical-Chatbot

conda create -n medibot python=3.10 -y
conda activate medibot

pip install -r requirements.txt

Create a .env file in the root directory:

PINECONE_API_KEY=your_pinecone_api_key_here
GROQ_API_KEY=your_groq_api_key_here

To get a Pinecone API key:

- Go to https://www.pinecone.io/
- Sign up for a free account
- Navigate to API Keys section
- Copy your API key

To get a Groq API key:

- Go to https://console.groq.com/
- Sign up for a free account
- Navigate to API Keys section
- Create a new API key
- Copy your API key (starts with gsk_)
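
Before starting the app, it can help to check that both keys are actually set. The sketch below is a minimal example, assuming the .env values have already been loaded into the process environment (how the project loads them is not shown here); `missing_keys` and `looks_like_groq_key` are hypothetical helper names.

```python
import os

# The two variables named in the .env file above.
REQUIRED_KEYS = ("PINECONE_API_KEY", "GROQ_API_KEY")


def missing_keys(env=None):
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


def looks_like_groq_key(value):
    """Cheap sanity check: Groq API keys start with 'gsk_'."""
    return value.startswith("gsk_")


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    elif not looks_like_groq_key(os.environ["GROQ_API_KEY"]):
        print("GROQ_API_KEY does not start with gsk_; check that it was copied correctly.")
    else:
        print("Both API keys are set.")
```

The prefix check only catches copy-paste mistakes; it does not verify that the key is valid.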
- Place your medical PDF files in the data/ directory
- Run the indexing script to create embeddings:

python store_index.py

This will:

- Load all PDFs from the data/ directory
- Split documents into chunks (500 chars with 20 char overlap)
- Generate embeddings using HuggingFace model
- Store vectors in Pinecone index named medical-bot

python app.py

The application will start on http://localhost:8080

- Open your browser and navigate to http://localhost:8080
- Type your medical question in the chat interface
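
The chunking step can be illustrated with a simplified fixed-window splitter. This is only a sketch: the project uses LangChain's RecursiveCharacterTextSplitter, which also tries to break on separators such as paragraph and line breaks, whereas the hypothetical `split_text` below cuts purely by character count.

```python
def split_text(text, chunk_size=500, chunk_overlap=20):
    """Split text into windows of chunk_size characters that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this many characters after the last
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    sample = "".join(chr(97 + i % 26) for i in range(1000))  # 1000 characters of filler text
    pieces = split_text(sample)
    print([len(piece) for piece in pieces])
```

The overlap means the last 20 characters of one chunk are repeated at the start of the next, so a sentence cut at a chunk boundary still appears partly in both.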
- Press Enter or click Send
- Wait for the AI-generated response
- "What is diabetes?"
- "What are the symptoms of hypertension?"
- "How is acne treated?"
- "What causes anemia?"
Medical-Chatbot/
├── app.py                  # Main Flask application
├── store_index.py          # Script to create Pinecone index
├── requirements.txt        # Python dependencies
├── setup.py                # Package setup file
├── .env                    # Environment variables (not in repo)
├── README.md               # Project documentation
│
├── data/                   # Medical PDF documents
│   └── *.pdf
│
├── src/                    # Source code modules
│   ├── __init__.py
│   ├── helper.py           # Helper functions for data processing
│   └── prompt_template.py  # System prompt for the chatbot
│
├── templates/              # HTML templates
│   └── chat.html           # Chat interface
│
└── research/               # Jupyter notebooks for experimentation
    └── trials.ipynb
Renders the main chat interface.
Response: HTML page
Handles chat messages and returns bot responses.
Request:
{
  "msg": "What is diabetes?"
}

Response:
Diabetes is a chronic condition characterized by high blood sugar levels...
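
A client call to the chat endpoint might look like the sketch below, using only the standard library. The route path is not given above, so it is left as a parameter and the `/your-chat-route` value in the usage example is a placeholder, not the real route. Sending the body as JSON follows the request example shown above and is an assumption about what the server accepts; `build_chat_request` and `send_chat` are hypothetical helper names.

```python
import json
import urllib.request


def build_chat_request(base_url, path, message):
    """Build a POST request carrying the message as {"msg": ...} in a JSON body."""
    body = json.dumps({"msg": message}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(request):
    """Send the request and return the plain-text reply from the bot."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # "/your-chat-route" is a placeholder; substitute the route defined in app.py.
    req = build_chat_request("http://localhost:8080", "/your-chat-route", "What is diabetes?")
    print(req.get_method(), req.full_url)
    # print(send_chat(req))  # uncomment once the app is running and the route is filled in
```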

