KJanzon/youtube-qa-chatbot

🐍 Python Video Q&A Chatbot

An AI-powered tutor for Python YouTube videos — ask, watch, learn, and code.

An interactive chatbot that takes natural-language questions and searches YouTube videos for the answers, powered by LangChain, OpenAI, and ChromaDB. It specialises in YouTube videos for learning Python programming.


🚀 Features

  • 🔗 Paste a YouTube video URL about Python coding and embed it directly in the app
  • 🧠 Ask questions about the content using natural language
  • 📖 Vector search over transcript chunks with timestamp + chapter metadata
  • 📺 Plays the part of the video that answers the question
  • 📟 Offers additional explanations and a coding challenge
  • 🤖 Powered by Llama3 8B + LangChain RetrievalQA
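Playing the part of the video that answers a question comes down to turning a chunk's start timestamp into a player offset. The sketch below shows one way this could be done with the standard library only; `extract_video_id` and `embed_url_at` are hypothetical helper names, not functions taken from this repository.

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> str:
    """Return the video ID from a watch or youtu.be URL (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.hostname == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        return parsed.path.lstrip("/")
    # Standard watch URLs carry the ID in the "v" query parameter.
    ids = parse_qs(parsed.query).get("v")
    if not ids:
        raise ValueError(f"no video ID found in {url!r}")
    return ids[0]


def embed_url_at(url: str, start_seconds: float) -> str:
    """Build an embed URL that begins playback at the given offset."""
    video_id = extract_video_id(url)
    # The embed player's "start" parameter takes whole seconds.
    return f"https://www.youtube.com/embed/{video_id}?start={int(start_seconds)}"
```

In a Streamlit app the same offset could instead be passed to `st.video` through its `start_time` argument; which approach this project uses is not stated here.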

📊 View Project Presentation Slides


🛠️ Tech Stack

| Component | Description |
| --- | --- |
| Streamlit | Frontend UI for chat, video player, and code interaction |
| LangChain | Retrieval-Augmented Generation (RAG) orchestration |
| LangSmith | Tracing and debugging of LLM chains and prompts |
| ChatGroq (LLaMA3-8B) | LLM used for answering questions and generating challenges |
| OpenAIEmbeddings | Converts transcript chunks into vector representations |
| ChromaDB | Local vector database for storing per-video embeddings |
| pytubefix | Downloads captions and extracts video metadata |
| GPT-4 (optional) | Evaluates the quality of LLaMA3 responses post-hoc |
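To illustrate what the embedding and vector-store components do together, here is a toy retrieval sketch in plain Python. It is not the project's code and does not call OpenAIEmbeddings or ChromaDB: the three-dimensional vectors, the sample chunks, and the `cosine` and `top_k` helpers are all made up for illustration. The idea is only that each transcript chunk is stored with a vector plus timestamp and chapter metadata, and a question is answered from the chunks whose vectors lie closest to the question's vector.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], store: list[dict], k: int = 2) -> list[dict]:
    """Return the k stored chunks most similar to the query vector."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item["vector"]), reverse=True)
    return ranked[:k]


# Toy store: real embeddings have hundreds or thousands of dimensions.
store = [
    {"text": "Lists are mutable sequences.", "start": 42.0, "chapter": "Lists", "vector": [0.9, 0.1, 0.0]},
    {"text": "Dictionaries map keys to values.", "start": 310.5, "chapter": "Dicts", "vector": [0.1, 0.9, 0.1]},
    {"text": "For loops iterate over iterables.", "start": 620.0, "chapter": "Loops", "vector": [0.0, 0.2, 0.9]},
]

# A made-up query vector that points mostly along the first dimension.
best = top_k([0.8, 0.2, 0.1], store, k=1)[0]
print(best["chapter"], best["start"])
```

The metadata returned with the best chunk (here `start` and `chapter`) is what lets an app jump to the matching point in the video.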

📦 Setup Instructions

  1. Clone the repo

    git clone https://github.com/KJanzon/youtube-qa-chatbot.git
    cd youtube-qa-chatbot
  2. Set up a virtual environment

    python -m venv venv
    source venv/bin/activate  # or .\venv\Scripts\activate on Windows
  3. Install dependencies

    pip install -r requirements.txt
  4. Add your API keys. Create a .env file with:

    OPENAI_API_KEY=your_openai_key_here
    LANGCHAIN_API_KEY=your_langchain_key_here
    HUGGINGFACEHUB_API_TOKEN=your_huggingface_token_here
    GROQ_API_KEY=your_groq_key_here
  5. Run the app

    streamlit run interfaces/streamlit_chat.py
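A missing key usually surfaces only as an authentication error deep inside a chain, so a quick check before launching can save time. The sketch below is a hypothetical helper, not part of this repository. It assumes the values from `.env` have already been loaded into the process environment (for example by python-dotenv) and that all four keys listed in step 4 are required, which the project may not enforce.

```python
import os

# Key names taken from the .env template in step 4.
REQUIRED_KEYS = (
    "OPENAI_API_KEY",
    "LANGCHAIN_API_KEY",
    "HUGGINGFACEHUB_API_TOKEN",
    "GROQ_API_KEY",
)


def missing_keys(env=None) -> list[str]:
    """Return the required key names that are unset or blank in env."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing API keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```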

📁 Folder Structure

├── app/                  # Video processing + transcript embedding
├── data/                 # Downloaded caption files (.srt)
├── interfaces/           # Streamlit front-end
├── utils/                # Helpers (e.g., clean_srt, time utils, chapter ranker)
├── vectorstore/          # ChromaDB persistent store (per-video)
├── .env                  # API key config (excluded from Git)
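The `data/` and `utils/` folders suggest a flow from `.srt` captions to timestamped chunks. As a rough sketch of what such a step might look like, the code below parses SRT text and merges consecutive captions into chunks that keep their start and end times. The names `to_seconds`, `parse_srt`, and `chunk_segments`, and the `max_chars` limit, are assumptions for illustration and are not taken from the repository's helpers.

```python
import re

# SRT timestamps look like 00:01:15,250 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp to seconds."""
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(text: str) -> list[dict]:
    """Parse SRT text into a list of {start, end, text} segments."""
    segments = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # The timing line is the one containing the "-->" arrow.
        idx = next((i for i, line in enumerate(lines) if "-->" in line), None)
        if idx is None:
            continue
        start, end = (to_seconds(part) for part in lines[idx].split("-->"))
        caption = " ".join(line.strip() for line in lines[idx + 1:] if line.strip())
        segments.append({"start": start, "end": end, "text": caption})
    return segments


def chunk_segments(segments: list[dict], max_chars: int = 200) -> list[dict]:
    """Merge consecutive segments into chunks of at most max_chars characters."""
    chunks, current = [], None
    for seg in segments:
        if current is not None and len(current["text"]) + 1 + len(seg["text"]) > max_chars:
            chunks.append(current)
            current = None
        if current is None:
            current = dict(seg)  # copy so the input segments stay untouched
        else:
            current["text"] += " " + seg["text"]
            current["end"] = seg["end"]
    if current is not None:
        chunks.append(current)
    return chunks
```

Each chunk keeps the start time of its first caption, which is the value a vector store would carry as metadata so that an answer can be linked back to a position in the video.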

📸 Screenshot

[Three screenshots of the app, taken 2025-04-18 at 09:07:07, 11:54:56, and 11:55:03]

🧠 TODO / Roadmap
