This project implements a Retrieval-Augmented Generation (RAG) pipeline to answer questions about TransFi’s products and solutions.
It uses asynchronous scraping, semantic embeddings, and vector-based retrieval to build a local knowledge base of TransFi’s website content.
✅ Async-first architecture — concurrent scraping & query processing
✅ Website crawler for TransFi’s Products and Solutions pages
✅ Text cleaning & chunking for structured ingestion
✅ FAISS-based semantic search using Sentence Transformers
✅ LLM-powered answer generation (HuggingFace or OpenAI)
✅ Rich metrics logging for both ingestion and query phases
✅ Modular design with utils/ for code reuse (ready for FastAPI Part 2)
python -m venv venv
source venv/bin/activate # macOS/Linux
venv\Scripts\activate   # Windows

pip install -r requirements.txt

Notes (Windows): faiss-cpu installs via pip. If PyTorch wheels fail, install a compatible version from https://pytorch.org and re-run the install.
Description: Scrapes TransFi’s “Products” and “Solutions” pages asynchronously, cleans the text, chunks it, generates embeddings, builds a FAISS index, and logs ingestion metrics.
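The fetch step is asynchronous; below is a minimal sketch of concurrent page fetching, assuming aiohttp. The real crawler in ingest.py may discover links and handle retries differently.

```python
# Minimal sketch of concurrent page fetching, assuming aiohttp.
# The real crawler in ingest.py may differ (link discovery, retries, politeness).
import asyncio
import aiohttp

async def fetch(session: aiohttp.ClientSession, url: str) -> str | None:
    try:
        async with session.get(url, timeout=aiohttp.ClientTimeout(total=20)) as resp:
            resp.raise_for_status()
            return await resp.text()
    except (aiohttp.ClientError, asyncio.TimeoutError):
        return None  # counted as a failed page in the ingestion metrics

async def fetch_all(urls: list[str]) -> list[str | None]:
    # One shared session; all requests run concurrently.
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*(fetch(session, u) for u in urls))

# pages = asyncio.run(fetch_all(["https://www.transfi.com"]))
```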
Run
python ingest.py --url "https://www.transfi.com"

Configuration options
- --url: Base site to crawl. Default: none (required when run as a script)
- Data dirs: data/raw, data/clean, data/index (created automatically)
- Embedding model: all-MiniLM-L6-v2 (in utils/embedding.py)
- Chunking: max_len=500, overlap=50 (in utils/text_processing.py); see the sketch after this list
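A minimal sketch of the chunk, embed, and index step under the defaults above; the function names and index filename here are illustrative, not the actual utils/ API.

```python
# Minimal sketch of the chunk -> embed -> index step, using the defaults above.
# Helper names and the index filename are illustrative, not the actual utils/ API.
from pathlib import Path
import faiss
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, max_len: int = 500, overlap: int = 50) -> list[str]:
    # Overlapping windows so text split at a chunk boundary still appears intact in one chunk.
    step = max_len - overlap
    return [text[i:i + max_len] for i in range(0, len(text), step) if text[i:i + max_len].strip()]

def build_index(pages: list[str], index_path: str = "data/index/faiss.index"):
    model = SentenceTransformer("all-MiniLM-L6-v2")
    chunks = [c for page in pages for c in chunk_text(page)]
    # Normalized embeddings + inner-product index == cosine-similarity search.
    embeddings = model.encode(chunks, convert_to_numpy=True, normalize_embeddings=True)
    index = faiss.IndexFlatIP(embeddings.shape[1])
    index.add(embeddings.astype("float32"))
    Path(index_path).parent.mkdir(parents=True, exist_ok=True)
    faiss.write_index(index, index_path)
    return index, chunks
```

A typical ingestion run logs metrics like the following: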
=== Ingestion Metrics ===
Total Time: 45.2s
Pages Scraped: 23
Pages Failed: 2
Total Chunks Created: 456
Total Tokens Processed: 125,340
Indexing Time: 2.1s
Average Scraping Time per Page: 1.8s
Errors: None

Description: Retrieves relevant text chunks from the index, generates answers using an LLM, cites sources, and logs detailed query metrics.
Run (single question)
python query.py --question "What is BizPay and its key features?"

Run (batch questions)
python query.py --questions questions.txt

Run (concurrent batch)
python query.py --questions questions.txt --concurrent

Configuration options
- --question: Single question string
- --questions: Path to a text file (one question per line)
- --concurrent: Run multiple questions concurrently
- Index dir: data/index (in query.py)
- LLM: gpt2 text-generation via transformers (in utils/llm_utils.py); see the sketch after this list
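Under the hood, the query path is retrieve-then-generate. Below is a minimal sketch assuming the configuration above and that the chunk texts are kept alongside the FAISS index; the real utils/ helpers may differ.

```python
# Minimal sketch of the retrieve -> generate path, assuming the config above and
# that chunk texts are available alongside the FAISS index; real helpers may differ.
import faiss
from sentence_transformers import SentenceTransformer
from transformers import pipeline

def answer_question(question: str, chunks: list[str],
                    index_path: str = "data/index/faiss.index", top_k: int = 5) -> str:
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    index = faiss.read_index(index_path)

    # Embed the question and pull the top_k most similar chunks.
    q_emb = embedder.encode([question], convert_to_numpy=True, normalize_embeddings=True)
    _, ids = index.search(q_emb.astype("float32"), top_k)
    context = "\n".join(chunks[i] for i in ids[0])

    # Feed the retrieved context plus the question to the gpt2 text-generation pipeline.
    generator = pipeline("text-generation", model="gpt2")
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    out = generator(prompt, max_new_tokens=150, do_sample=False)
    return out[0]["generated_text"][len(prompt):].strip()
```

Running the single-question command above produces output like this: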
Question: What is BizPay and its key features?
Answer:
BizPay enables businesses to process seamless cross-border payments...
Sources:
1. BizPay - https://www.transfi.com/products/bizpay
Snippet: "BizPay enables businesses to..."
2. Solutions Overview - https://www.transfi.com/solutions
Snippet: "Key features include..."
=== Query Metrics ===
Total Latency: 2.4s
Retrieval Time: 0.3s
LLM Time: 2.0s
Documents Retrieved: 5
Documents Used in Answer: 2

This adds a REST API around the RAG pipeline and a simple webhook receiver to demonstrate async callbacks.
- Use the same virtual environment and dependencies from Part 1
- Ensure the FAISS index exists (run ingest.py at least once) before using query endpoints
# Terminal 1
python webhook_receiver.py --port 8001
# Terminal 2
uvicorn api:app --port 8000
# Terminal 3 (trigger ingestion with webhook callback)
curl -X POST http://localhost:8000/api/ingest \
-H "Content-Type: application/json" \
-d '{"urls": ["https://www.transfi.com"], "callback_url": "http://localhost:8001/webhook"}'You should see a webhook payload printed in the Terminal 1 window when ingestion completes.
- POST /api/ingest (body: { "urls": ["https://..."], "callback_url": "http://.../webhook" }); see the sketch after this list
  - Triggers background ingestion; immediately returns { "message": "Ingestion started" }
  - If callback_url is provided, a completion payload is POSTed to it
- POST /api/query (body: { "question": "..." })
  - Returns { question, answer, sources[], metrics }
- POST /api/query/batch (body: { "questions": ["...", "..."], "callback_url": "http://.../webhook" })
  - Executes questions concurrently; returns results and optionally sends them to callback_url
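A minimal sketch of how the ingest endpoint can return immediately and POST a completion payload later, assuming FastAPI BackgroundTasks and httpx; run_ingestion() here is a stand-in for the real pipeline in api.py.

```python
# Sketch of /api/ingest: accept the request, schedule ingestion in the background,
# and optionally call back the webhook when done. Names are illustrative.
import httpx
from fastapi import BackgroundTasks, FastAPI
from pydantic import BaseModel

app = FastAPI()

class IngestRequest(BaseModel):
    urls: list[str]
    callback_url: str | None = None

async def run_ingestion(urls: list[str], callback_url: str | None) -> None:
    metrics = {"status": "completed", "urls": urls}  # placeholder for real ingestion metrics
    if callback_url:
        async with httpx.AsyncClient(timeout=15) as client:  # 15s webhook timeout, as configured
            await client.post(callback_url, json={"metrics": metrics})

@app.post("/api/ingest")
async def ingest(req: IngestRequest, background_tasks: BackgroundTasks):
    background_tasks.add_task(run_ingestion, req.urls, req.callback_url)
    return {"message": "Ingestion started"}
```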
# 1) Start webhook receiver (Terminal 1)
python webhook_receiver.py --port 8001
# 2) Start API (Terminal 2)
uvicorn api:app --port 8000
# 3) Kick off ingestion (Terminal 3)
curl -X POST http://localhost:8000/api/ingest \
-H "Content-Type: application/json" \
-d '{"urls": ["https://www.transfi.com"], "callback_url": "http://localhost:8001/webhook"}'
# 4) After ingestion completes, ask a question via API (Terminal 3)
curl -X POST http://localhost:8000/api/query \
-H "Content-Type: application/json" \
-d '{"question": "What is BizPay?"}'
# 5) Or batch query with optional webhook (Terminal 3)
curl -X POST http://localhost:8000/api/query/batch \
-H "Content-Type: application/json" \
-d '{"questions": ["What is BizPay?", "What are TransFi payouts?"], "callback_url": "http://localhost:8001/webhook"}'- Server port:
uvicorn api:app --port 8000 - Index dir:
data/index(configured inapi.py) - Webhook timeout: 15s (in
api.pysend_webhook) - Retrieval top_k: 3 (in
api.pyprocess_question)
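The batch endpoint's concurrency is essentially a fan-out with asyncio.gather; a minimal sketch follows, with process_question() stubbed because only its name and top_k setting are known from api.py.

```python
# Sketch of concurrent batch querying; process_question is a stand-in for the
# real api.py helper that retrieves top_k=3 chunks and generates an answer.
import asyncio

async def process_question(question: str) -> dict:
    await asyncio.sleep(0)  # placeholder for retrieval + LLM work
    return {"question": question, "answer": "..."}

async def answer_batch(questions: list[str]) -> list[dict]:
    # gather() runs all questions concurrently and preserves input order.
    return await asyncio.gather(*(process_question(q) for q in questions))

# results = asyncio.run(answer_batch(["What is BizPay?", "What are TransFi payouts?"]))
```

For reference, the ingestion completion payload printed by the webhook receiver looks like this: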
{
"metrics": {
"status": "completed",
"total_time": 42.13,
"urls": ["https://www.transfi.com"]
}
}