A sophisticated movie recommendation system built with FalkorDB graph database, NLP (spaCy), and Streamlit for intelligent, relationship-based movie discovery.
- Graph-Based Recommendations: Leverages FalkorDB to find movies through actor, director, and genre relationships
- NLP-Powered Analysis: Uses spaCy for entity extraction and semantic analysis of movie plots
- Multiple Recommendation Strategies:
- Similar movies based on shared attributes
- Genre-based filtering with rating thresholds
- Actor and director filmographies
- Actor collaboration discovery
- Network exploration (degrees of separation)
- Interactive Web Interface: Clean, modern UI built with Streamlit
- Real-Time Search: Fast graph queries for instant results
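The network exploration feature (degrees of separation) boils down to a breadth-first search over a co-star graph. The sketch below is a plain-Python illustration on a tiny placeholder cast list, not the project's implementation; in the project this traversal presumably runs as a graph query inside FalkorDB. The function names and the sample data are made up for illustration.

```python
from collections import deque

# Hypothetical sample data: movie title -> cast list (placeholders only).
CASTS = {
    "Movie A": ["Actor 1", "Actor 2"],
    "Movie B": ["Actor 2", "Actor 3"],
    "Movie C": ["Actor 3", "Actor 4"],
    "Movie D": ["Actor 5"],
}


def build_costar_graph(casts):
    """Map each actor to the set of actors they share at least one movie with."""
    graph = {}
    for cast in casts.values():
        for actor in cast:
            graph.setdefault(actor, set()).update(a for a in cast if a != actor)
    return graph


def degrees_of_separation(graph, start, target):
    """Return the number of co-star hops between two actors, or None if unconnected."""
    if start not in graph or target not in graph:
        return None
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        actor, dist = queue.popleft()
        for costar in graph[actor]:
            if costar == target:
                return dist + 1
            if costar not in seen:
                seen.add(costar)
                queue.append((costar, dist + 1))
    return None


GRAPH = build_costar_graph(CASTS)
```

Because the search expands one hop at a time, the first time the target is reached is guaranteed to be along a shortest path.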
┌─────────────────┐
│    TMDb API     │
│  (Data Source)  │
└────────┬────────┘
         │
         ▼
┌─────────────────┐     ┌──────────────┐
│ Data Collector  │────▶│ NLP Processor│
│(API Integration)│     │   (spaCy)    │
└────────┬────────┘     └──────┬───────┘
         │                     │
         ▼                     ▼
┌─────────────────────────────────┐
│   FalkorDB Graph Database       │
│  (Movies, Actors, Directors)    │
└────────┬────────────────────────┘
         │
         ▼
┌─────────────────┐     ┌──────────────┐
│  Recommender    │────▶│  Streamlit   │
│ (Graph Queries) │     │  (Frontend)  │
└─────────────────┘     └──────────────┘
- Graph Database: FalkorDB (Redis-based graph database)
- NLP: spaCy, sentence-transformers
- Data Source: TMDb (The Movie Database) API
- Backend: Python 3.12
- Frontend: Streamlit
- Data Processing: pandas, numpy
- Visualization: plotly, networkx
- Python 3.12+
- Docker (for FalkorDB)
- TMDb API key (free to obtain from TMDb)
- Clone the repository
git clone https://github.com/EngAdhamTamer/movie-knowledge-graph.git
cd movie-knowledge-graph
- Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\Activate.ps1
- Install dependencies
pip install -r requirements.txt
python -m spacy download en_core_web_sm
- Configure environment
cp .env.example .env
# Edit .env and add your TMDb API key
- Start FalkorDB
docker run -p 6379:6379 -it --rm falkordb/falkordb:latest
- Populate database
python scripts/populate_graph.py --movies 30
- Launch application
streamlit run app.py
Visit http://localhost:8501 in your browser!
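The configuration step relies on a `.env` file holding the TMDb API key. As a rough sketch of how such a file can be read with only the standard library, the snippet below parses `KEY=VALUE` lines. The variable name `TMDB_API_KEY` is an assumption (check `.env.example` for the real name), and the project itself may well use a library such as python-dotenv instead.

```python
def parse_env_text(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read a .env file from disk; return an empty dict if it does not exist."""
    try:
        with open(path, encoding="utf-8") as handle:
            return parse_env_text(handle.read())
    except FileNotFoundError:
        return {}


def get_tmdb_api_key(env):
    """Fetch the key under the ASSUMED name TMDB_API_KEY, failing loudly if absent."""
    key = env.get("TMDB_API_KEY", "")
    if not key:
        raise RuntimeError("TMDB_API_KEY is missing; add it to your .env file")
    return key
```

Failing early with a clear message is usually friendlier than letting the first TMDb request fail with an authentication error.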
// Nodes
(:Movie {movie_id, title, year, overview, rating, runtime, budget, revenue})
(:Actor {actor_id, name})
(:Director {director_id, name})
(:Genre {name})
// Relationships
(Actor)-[:ACTED_IN {character}]->(Movie)
(Director)-[:DIRECTED]->(Movie)
(Movie)-[:HAS_GENRE]->(Genre)
(Movie)-[:SIMILAR_TO {score}]->(Movie)

from src.graph_builder import MovieGraphBuilder
from src.recommender import MovieRecommender
builder = MovieGraphBuilder()
recommender = MovieRecommender(builder)
# Get Tom Hanks movies
movies = recommender.recommend_by_actor("Tom Hanks", limit=10)
# Find movies similar to Inception
similar = recommender.recommend_similar_movies("Inception", limit=5)
# Find actors connected to Leonardo DiCaprio
network = recommender.get_actor_network("Leonardo DiCaprio", depth=2)

Shows top-rated movies and database statistics
Find similar movies based on multiple factors
Discover connections between actors
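The graph schema (Movie and Actor nodes joined by ACTED_IN) can also be queried directly in Cypher. The sketch below builds a parameterised query for movies that share actors with a given title and offers a helper that sends it through the `falkordb` Python client. The graph name `movies`, the helper names, and the ranking by shared-actor count are assumptions for illustration; `MovieRecommender` may rank results differently.

```python
def shared_actor_query(title, limit=5):
    """Build a Cypher query and params for movies sharing actors with `title`."""
    query = (
        "MATCH (m:Movie {title: $title})<-[:ACTED_IN]-(a:Actor)"
        "-[:ACTED_IN]->(other:Movie) "
        "WHERE other.title <> $title "
        "RETURN other.title AS title, count(a) AS shared_actors "
        # The limit is coerced to int and inlined; the title stays a parameter.
        f"ORDER BY shared_actors DESC LIMIT {int(limit)}"
    )
    return query, {"title": title}


def run_query(query, params, graph_name="movies"):
    """Run the query against a local FalkorDB (graph name is an assumption)."""
    # Imported lazily so the query builder works without the client installed.
    from falkordb import FalkorDB

    db = FalkorDB(host="localhost", port=6379)
    graph = db.select_graph(graph_name)
    return graph.query(query, params).result_set


# Usage (needs a running FalkorDB and `pip install falkordb`):
#   query, params = shared_actor_query("Inception")
#   for row in run_query(query, params):
#       print(row)
```

Passing the title as a query parameter rather than formatting it into the string avoids quoting problems with titles that contain apostrophes.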
movie-knowledge-graph/
├── src/
│   ├── data_collector.py    # TMDb API integration
│   ├── nlp_processor.py     # NLP & embeddings
│   ├── graph_builder.py     # FalkorDB operations
│   └── recommender.py       # Recommendation algorithms
├── scripts/
│   └── populate_graph.py    # Database population
├── tests/
├── app.py                   # Streamlit interface
├── config.py                # Configuration
└── requirements.txt
- Collaborative Filtering: Based on shared actors, directors, and genres
- Content-Based: Using NLP embeddings of plot descriptions (planned)
- Graph-Based: Leveraging path traversal and relationship strength
- Hybrid: Combining multiple signals for better accuracy
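To make the hybrid strategy concrete, here is a minimal sketch that combines three overlap signals (actors, genres, director) into one weighted score. The Jaccard measure, the weights, and the dictionary layout are illustrative assumptions, not the scoring the project actually uses.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections: |A & B| / |A | B| (0.0 if both empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Assumed weights; they sum to 1 so the final score stays in [0, 1].
WEIGHTS = {"actors": 0.5, "genres": 0.3, "director": 0.2}


def hybrid_score(movie, other, weights=WEIGHTS):
    """Weighted sum of actor overlap, genre overlap, and a same-director flag."""
    signals = {
        "actors": jaccard(movie["actors"], other["actors"]),
        "genres": jaccard(movie["genres"], other["genres"]),
        "director": 1.0 if movie["director"] == other["director"] else 0.0,
    }
    return sum(weights[name] * value for name, value in signals.items())


def rank_similar(movie, candidates, limit=5):
    """Return (title, score) pairs for the best-scoring candidates, excluding the movie itself."""
    scored = [
        (hybrid_score(movie, c), c["title"])
        for c in candidates
        if c["title"] != movie["title"]
    ]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(title, round(score, 3)) for score, title in scored[:limit]]
```

Keeping each signal in the range 0 to 1 before weighting makes the weights directly comparable, which simplifies tuning later.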
- User ratings and personalized recommendations
- Sentiment analysis of movie reviews
- Movie poster similarity using computer vision
- REST API with FastAPI
- Deployment to cloud (Heroku/Streamlit Cloud)
- Advanced graph visualizations with D3.js
- Topic modeling for plot analysis
- Real-time trending movies analysis
- Query Speed: < 100ms for most graph queries
- Database Size: ~60 movies, 400+ actors, 60+ directors
- Scalability: Tested with 1,000+ movies
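Latency figures like the query speed above are easy to re-check locally. The helper below is a generic timing sketch built on `time.perf_counter`; it is not the benchmark that produced the numbers in this README, and the callable you pass in (for example a lambda wrapping `recommender.recommend_similar_movies`) is up to you.

```python
import statistics
import time


def time_calls(func, repeats=20):
    """Call `func` repeatedly and return latency statistics in milliseconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": repeats,
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

Reporting the median alongside the maximum helps separate typical latency from one-off spikes such as a cold cache on the first call.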
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the project
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- TMDb for the excellent movie data API
- FalkorDB for the powerful graph database
- spaCy for NLP capabilities
- Streamlit for the amazing web framework
Your Name
- GitHub: @EngAdhamTamer
- LinkedIn: Your LinkedIn
Project Link: https://github.com/EngAdhamTamer/movie-knowledge-graph
⭐ Star this repo if you find it helpful!