An intelligent middleware service designed to enhance the default search capabilities of the Koha Library Online Public Access Catalog (OPAC). When a user's search yields "No results found", this system seamlessly intervenes to provide context-aware typo corrections ("Did you mean?") or highly relevant book suggestions ("Related Items").
- Context-Aware "Did You Mean?":
- Utilizes `fuzzywuzzy` string matching to detect typos.
- Adjusts logic based on the search field (e.g., checks against live author data for author searches, and title data for title searches).
- Prevents "subset" false positives (e.g., rejecting "Star" when the user searches for "The sun is also a star").
- Intent-Driven "Related Items":
- Hybrid Signal Strategy: Pre-processes failed queries to extract core keywords, queries the Google Books API for rich metadata signals (subjects, authors), and cross-references them against local catalogue data.
- Smart Scoring: Ranks candidate books using a weighted scoring algorithm based on keyword matches, subject overlap, author matches, and a popularity multiplier derived from historical circulation data.
- Seamless Frontend Integration: Injects UI updates directly into the Koha OPAC via a lightweight jQuery snippet without modifying Koha's core backend code.
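The weighted-scoring idea behind "Related Items" can be illustrated with a small sketch. The weight values, field names, and log-scaled popularity multiplier below are assumptions for illustration, not the project's tuned parameters:

```python
import math

# Illustrative weights -- real values would be tuned against your catalogue.
WEIGHTS = {"keyword": 3.0, "subject": 2.0, "author": 4.0}

def score_candidate(candidate: dict, signals: dict) -> float:
    """Rank a local catalogue record against signals extracted from the
    failed query and the Google Books metadata."""
    score = 0.0
    score += WEIGHTS["keyword"] * len(signals["keywords"] & candidate["keywords"])
    score += WEIGHTS["subject"] * len(signals["subjects"] & candidate["subjects"])
    if candidate["author"] in signals["authors"]:
        score += WEIGHTS["author"]
    # Popularity multiplier from historical checkout counts: a log scale
    # keeps blockbusters from drowning out every other match.
    return score * (1 + math.log1p(candidate.get("checkouts", 0)))

book = {"keywords": {"mars", "survival"}, "subjects": {"science fiction"},
        "author": "Andy Weir", "checkouts": 12}
sig = {"keywords": {"mars", "astronaut"}, "subjects": {"science fiction"},
       "authors": {"Andy Weir"}}
print(round(score_candidate(book, sig), 1))
```

A record with no signal overlap scores zero regardless of popularity, since the multiplier is applied to the match score rather than added to it.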
The system operates on a Hybrid Data Model:
- Live Koha API: Fetches real-time catalog data for accurate typo corrections.
- Google Books API: Acts as a semantic engine to gather related subjects and authors based on search intent.
- Local CSV Data: Uses `catalogue.csv` and `circulation.csv` exports to perform lightning-fast, complex cross-referencing and popularity ranking without overloading the live Koha database.
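Extracting subject and author signals from a Google Books response might look like the sketch below. The JSON shape follows the public Books API `volumes` format (`items[].volumeInfo.authors` / `.categories`); the `extract_signals` helper itself is hypothetical:

```python
def extract_signals(books_response: dict) -> dict:
    """Collect subject/author signals from a Google Books API volumes
    response (the JSON returned by GET /books/v1/volumes?q=...)."""
    subjects, authors = set(), set()
    for item in books_response.get("items", []):
        info = item.get("volumeInfo", {})
        # Normalise categories for case-insensitive matching downstream.
        subjects.update(c.lower() for c in info.get("categories", []))
        authors.update(info.get("authors", []))
    return {"subjects": subjects, "authors": authors}

# A trimmed-down sample mirroring the live API's JSON shape:
sample = {"items": [
    {"volumeInfo": {"title": "The Martian", "authors": ["Andy Weir"],
                    "categories": ["Fiction", "Science Fiction"]}},
    {"volumeInfo": {"title": "Artemis", "authors": ["Andy Weir"]}},
]}
print(extract_signals(sample))
```

The resulting sets are then cross-referenced against the local CSV exports to find in-catalogue candidates.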
Tech Stack:
- Backend: Python, FastAPI, Pandas, FuzzyWuzzy, Uvicorn
- Frontend: JavaScript, jQuery (injected via Koha `OPACUserJS`)
- Deployment: Gunicorn, Systemd, Apache2 (Reverse Proxy)
- Python 3.10+
- Koha ILS (v24.05+) with API access enabled (`biblios:search` permissions).
- Google Books API Key.
- Exports of your Koha `catalogue` and `circulation` data in CSV format.
Clone the repository and set up the environment:

```bash
git clone https://github.com/yourusername/koha-recommendation-engine.git
cd koha-recommendation-engine
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Create a `.env` file in the root directory:
```
GOOGLE_BOOKS_API_KEY="your_google_api_key"
KOHA_API_BASE_URL="your_koha_base_api_url"
KOHA_CLIENT_ID="your_koha_client_id"
KOHA_CLIENT_SECRET="your_koha_client_secret"
```

Place your exported data files into the `data/` directory:

```
data/catalogue.csv
data/circulation.csv
```
Run the development server:

```bash
uvicorn src.main:app --reload --port 5000
```

- Log in to the Koha Staff Client.
- Navigate to Koha Administration -> Global system preferences -> OPAC -> `OPACUserJS`.
- Paste the contents of the integration script (located in `docs/frontend_snippet.js`, or refer to the JS script provided in the project).
- Update the `backendUrl` variable in the script to point to your deployed FastAPI server (e.g., `https://koha.yourdomain.edu/recommender-api/search-enrichment`).
For production, it is recommended to run the FastAPI application using Gunicorn managed by Systemd, and served behind an Apache2 reverse proxy to handle SSL/TLS termination and prevent mixed-content errors.
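A systemd unit along these lines could manage the Gunicorn process; the service name, install path, user, and worker count below are placeholders to adapt to your server:

```ini
# /etc/systemd/system/koha-recommender.service  (paths and names are placeholders)
[Unit]
Description=Koha recommendation engine (FastAPI)
After=network.target

[Service]
User=www-data
WorkingDirectory=/opt/koha-recommendation-engine
EnvironmentFile=/opt/koha-recommendation-engine/.env
ExecStart=/opt/koha-recommendation-engine/.venv/bin/gunicorn \
    -k uvicorn.workers.UvicornWorker -w 2 -b 127.0.0.1:5000 src.main:app
Restart=on-failure

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable --now koha-recommender`, then point an Apache `ProxyPass` at `http://127.0.0.1:5000/` under whatever path your `backendUrl` uses.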
- Semantic Search via NLP: Transition from the rule-based signal strategy to vector embeddings (Word2Vec/Doc2Vec) with FAISS for true semantic relationship mapping.
- Multilingual Transliteration: Improve "Did You Mean" support for non-roman scripts using indic-transliteration libraries.
- Personalization: Integrate individual user borrowing histories to weight recommendations based on user affinity profiles.
See FUTURE_ENHANCEMENTS.md for detailed implementation plans.