ironLegion88/library_recommender

Koha OPAC Search Enrichment & Recommendation Engine

A middleware service that extends the default search of the Koha Library Online Public Access Catalog (OPAC). When a user's search returns "No results found", the service steps in with either a context-aware typo correction ("Did you mean?") or a list of relevant book suggestions ("Related Items").

Features

  • Context-Aware "Did You Mean?":
    • Utilizes fuzzywuzzy string matching to detect typos.
    • Adjusts logic based on the search field (e.g., checks against live author data for author searches, and title data for title searches).
    • Prevents "subset" false positives (e.g., rejecting "Star" when the user searches for "The sun is also a star").
  • Intent-Driven "Related Items":
    • Hybrid Signal Strategy: Pre-processes failed queries to extract core keywords, queries the Google Books API for rich metadata signals (subjects, authors), and cross-references them against local catalogue data.
    • Smart Scoring: Ranks candidate books using a weighted scoring algorithm based on keyword matches, subject overlap, author matches, and a popularity multiplier derived from historical circulation data.
  • Seamless Frontend Integration: Injects UI updates directly into the Koha OPAC via a lightweight jQuery snippet without modifying Koha's core backend code.
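
A minimal sketch of how the field-aware check and the "subset" guard could fit together. It uses difflib from the standard library as a stand-in for fuzzywuzzy so that it runs without extra packages. The function names, the 80-point threshold, and the shape of the index dictionary are illustrative assumptions, not the project's actual code.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score (difflib stand-in for a fuzzywuzzy ratio)."""
    return round(100 * SequenceMatcher(None, a.lower(), b.lower()).ratio())


def is_subset(query: str, candidate: str) -> bool:
    """True when the candidate is only a fragment of the query's own words."""
    query_words = set(query.lower().split())
    candidate_words = set(candidate.lower().split())
    return candidate_words < query_words  # strict subset


def did_you_mean(query, field, index, threshold=80):
    """Pick the best correction for `query` among the values known for `field`.

    `index` is a hypothetical mapping such as {"title": [...], "author": [...]},
    so an author search is only compared against author names.
    """
    best, best_score = None, threshold
    for candidate in index.get(field, []):
        if is_subset(query, candidate):
            continue  # e.g. reject "Star" for "The sun is also a star"
        score = similarity(query, candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best
```

With an index of {"title": ["Harry Potter", "Star"]}, a title search for "Harry Poter" would suggest "Harry Potter", while "The sun is also a star" would produce no suggestion because "Star" is rejected as a subset of the query.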

Architecture

The system operates on a Hybrid Data Model:

  1. Live Koha API: Fetches real-time catalog data for accurate typo corrections.
  2. Google Books API: Acts as a semantic engine to gather related subjects and authors based on search intent.
  3. Local CSV Data: Uses catalogue.csv and circulation.csv exports to perform the heavier cross-referencing and popularity ranking locally, without adding load to the live Koha database.
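
The flow from Google Books signals to a ranked local candidate could look roughly like the sketch below. The Google Books volumes endpoint and the volumeInfo fields (authors, categories) are part of the public API. Everything else is an assumption made for illustration: the weights, the log-scaled popularity multiplier, the function names, and the book dictionary keys (the real catalogue.csv columns are not documented here).

```python
import math
from urllib.parse import urlencode

GOOGLE_BOOKS_URL = "https://www.googleapis.com/books/v1/volumes"

# Hypothetical weights; the project's actual values are not documented here.
WEIGHTS = {"keyword": 3.0, "subject": 2.0, "author": 1.5}


def build_query_url(keywords, api_key):
    """Build the volumes search URL for the extracted core keywords."""
    return GOOGLE_BOOKS_URL + "?" + urlencode({"q": " ".join(keywords), "key": api_key})


def extract_signals(response: dict) -> dict:
    """Collect lower-cased subjects and authors from a volumes search response."""
    subjects, authors = set(), set()
    for item in response.get("items", []):
        info = item.get("volumeInfo", {})
        subjects.update(s.lower() for s in info.get("categories", []))
        authors.update(a.lower() for a in info.get("authors", []))
    return {"subjects": subjects, "authors": authors}


def score_book(book, keywords, signals, checkouts):
    """Weighted relevance multiplied by a popularity factor.

    `book` is an assumed dict with "title", "author" and "subjects" keys;
    `checkouts` is the book's count in the circulation export.
    """
    title_words = set(book["title"].lower().split())
    keyword_hits = len(title_words & {k.lower() for k in keywords})
    subject_hits = len({s.lower() for s in book["subjects"]} & signals["subjects"])
    author_hit = 1 if book["author"].lower() in signals["authors"] else 0
    relevance = (
        WEIGHTS["keyword"] * keyword_hits
        + WEIGHTS["subject"] * subject_hits
        + WEIGHTS["author"] * author_hit
    )
    popularity = 1.0 + math.log1p(checkouts)  # 1.0 for never-borrowed items
    return relevance * popularity
```

Multiplying rather than adding means a popular book with no relevance to the query still scores zero, and the logarithm keeps heavily borrowed titles from swamping better matches.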

Tech Stack:

  • Backend: Python, FastAPI, Pandas, FuzzyWuzzy, Uvicorn
  • Frontend: JavaScript, jQuery (injected via Koha OPACUserJS)
  • Deployment: Gunicorn, Systemd, Apache2 (Reverse Proxy)

Installation & Setup

Prerequisites

  • Python 3.10+
  • Koha ILS (v24.05+) with API access enabled (biblios:search permissions).
  • Google Books API Key.
  • Exports of your Koha catalogue and circulation data in CSV format.

1. Backend Setup

Clone the repository and set up the environment:

git clone https://github.com/ironLegion88/library_recommender.git
cd library_recommender
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Create a .env file in the root directory:

GOOGLE_BOOKS_API_KEY="your_google_api_key"
KOHA_API_BASE_URL="your_koha_base_api_url"
KOHA_CLIENT_ID="your_koha_client_id"
KOHA_CLIENT_SECRET="your_koha_client_secret"
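
Before starting the server, it can help to confirm that all four settings are present. The sketch below is a standard-library check written for this README. The project may well load the file differently (for example with python-dotenv), and the helper names parse_env_text and missing_settings are hypothetical.

```python
REQUIRED_SETTINGS = (
    "GOOGLE_BOOKS_API_KEY",
    "KOHA_API_BASE_URL",
    "KOHA_CLIENT_ID",
    "KOHA_CLIENT_SECRET",
)


def parse_env_text(text: str) -> dict:
    """Parse simple KEY="value" lines, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def missing_settings(settings: dict) -> list:
    """Return the required names that are absent or empty."""
    return [name for name in REQUIRED_SETTINGS if not settings.get(name)]
```

Calling missing_settings(parse_env_text(open(".env").read())) should return an empty list once the file is complete.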

Place your exported data files into the data/ directory:

  • data/catalogue.csv
  • data/circulation.csv

Run the development server:

uvicorn src.main:app --reload --port 5000

2. Frontend Integration (Koha OPAC)

  1. Log in to the Koha Staff Client.
  2. Navigate to Koha Administration -> Global system preferences -> OPAC -> OPACUserJS.
  3. Paste the contents of the integration script into the preference (use docs/frontend_snippet.js, or the JS script provided in the project).
  4. Update the backendUrl variable in the script to point to your deployed FastAPI server (e.g., https://koha.yourdomain.edu/recommender-api/search-enrichment).

3. Production Deployment

For production, run the FastAPI application with Gunicorn managed by Systemd, and serve it behind an Apache2 reverse proxy that handles SSL/TLS termination and prevents mixed-content errors.
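
As a starting point, a Gunicorn configuration file is plain Python. The sketch below is an assumption-laden example rather than the project's shipped configuration: the file name gunicorn.conf.py, the bind address (reusing port 5000 from the development command), the worker count formula, and the timeout are all illustrative. Depending on the installed Uvicorn version, the worker class may instead be provided by the separate uvicorn-worker package.

```python
# gunicorn.conf.py (hypothetical file name and values)
import multiprocessing

# Listen on localhost only; Apache2 proxies public HTTPS traffic to this port.
bind = "127.0.0.1:5000"

# Common rule of thumb from the Gunicorn docs: (2 x cores) + 1 workers.
workers = multiprocessing.cpu_count() * 2 + 1

# Run the ASGI (FastAPI) app through Uvicorn's Gunicorn worker.
worker_class = "uvicorn.workers.UvicornWorker"

# Seconds before an unresponsive worker is restarted (assumed value).
timeout = 30
```

With this file in place, a Systemd unit could start the service with gunicorn -c gunicorn.conf.py src.main:app from the project's virtual environment.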

Future Enhancements

  • Semantic Search via NLP: Transition from the rule-based signal strategy to vector embeddings (Word2Vec/Doc2Vec) indexed with FAISS for true semantic relationship mapping.
  • Multilingual Transliteration: Improve "Did You Mean" support for non-Roman scripts using indic-transliteration libraries.
  • Personalization: Integrate individual user borrowing histories to weight recommendations based on user affinity profiles.

See FUTURE_ENHANCEMENTS.md for detailed implementation plans.
