Commit 0d2dd9a

Baseline
0 parents  commit 0d2dd9a

20 files changed: 628 additions, 0 deletions

.gitignore

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
*.pyc
.DS_Store
backup
chroma_boardgames/
env/
.env
__pycache__

README.md

Lines changed: 75 additions & 0 deletions
@@ -0,0 +1,75 @@
# Simple Local RAG

A basic local LLM RAG chatbot built with LangChain and exposed via REST endpoints.

## Setup

Clone this repository, then create and activate a clean Python 3.10 virtual environment.

#### Dependencies

The following is a high-level list of the components used to run this local RAG:

- langchain
- streamlit
- streamlit-chat
- pypdf
- chromadb
- fastembed

```
pip install -r requirements.txt
```

#### Set up Ollama

This project depends on the Ollama platform to run the LLM locally. The setup is straightforward. First, visit ollama.ai and download the app for your operating system.

Next, open your terminal and run the following command to pull the Llama 3 model:

```
ollama pull llama3
```

#### Configuration

Create a `.env` file in the root directory and add the following environment variables:

```.env
CHROMA_PATH=chroma_boardgames
DATA_PATH_BG=data_boardgames
```
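
How the scripts read these values is not shown in this README. The following is a minimal sketch, assuming they come from the process environment or the `.env` file, with the folder names above as fallbacks. The helpers `load_env_file` and `get_setting` are hypothetical names, and a library such as python-dotenv could do the same job.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines from a .env file (hypothetical helper)."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_setting(name: str, default: str, file_values: dict[str, str]) -> str:
    """Prefer a real environment variable, then the .env file, then the default."""
    return os.environ.get(name, file_values.get(name, default))


file_values = load_env_file()
CHROMA_PATH = get_setting("CHROMA_PATH", "chroma_boardgames", file_values)
DATA_PATH_BG = get_setting("DATA_PATH_BG", "data_boardgames", file_values)
```

Letting a real environment variable override the file makes it possible to point a single run at a different folder without editing `.env`.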

#### Build the Vector Store

The `populate_database.py` script loads any PDF files it finds in the `DATA_PATH_BG` folder. The repository currently includes a couple of example board game instruction manuals to seed a Chroma vector store. The script loads each PDF into vector storage in two steps: first, it splits the document into smaller chunks to accommodate the token limit of the LLM; second, it vectorizes these chunks using FastEmbed embeddings and stores them in Chroma.

Each chunk gets an ID that identifies its PDF file, page number, and chunk number. This makes it possible to trace which chunks the model used to produce a response, and to add new data to the database incrementally without having to reload it fully.

Run the database load:

`python populate_database.py`

If you need to clear the database for any reason, run:

`python reset_database.py`

The above command removes the Chroma database. To recreate it, rerun `populate_database.py`.
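
The chunk ID scheme and the incremental loading described in this section can be sketched as follows. This is an illustration only: the `source:page:index` ID format, the `Chunk` dataclass, and both function names are assumptions, not the actual contents of `populate_database.py`. The sketch also assumes chunks arrive in document order.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    source: str          # path of the PDF file the chunk came from
    page: int            # page number within that PDF
    text: str
    chunk_id: str = ""


def assign_chunk_ids(chunks: list[Chunk]) -> list[Chunk]:
    """Assign IDs of the form 'source:page:index'; index restarts on each page."""
    last_page_key = None
    index = 0
    for chunk in chunks:
        page_key = f"{chunk.source}:{chunk.page}"
        # Same page as the previous chunk: bump the index, otherwise reset it.
        index = index + 1 if page_key == last_page_key else 0
        chunk.chunk_id = f"{page_key}:{index}"
        last_page_key = page_key
    return chunks


def select_new_chunks(chunks: list[Chunk], existing_ids: set[str]) -> list[Chunk]:
    """Keep only chunks whose IDs are not already in the vector store."""
    return [c for c in chunks if c.chunk_id not in existing_ids]


chunks = assign_chunk_ids([
    Chunk("data_boardgames/monopoly.pdf", 0, "first chunk of page 0"),
    Chunk("data_boardgames/monopoly.pdf", 0, "second chunk of page 0"),
    Chunk("data_boardgames/monopoly.pdf", 1, "first chunk of page 1"),
])
already_stored = {"data_boardgames/monopoly.pdf:0:0"}
new_chunks = select_new_chunks(chunks, already_stored)
print([c.chunk_id for c in new_chunks])
```

Because the ID is derived from the file, page, and position, rerunning the load on an unchanged PDF produces the same IDs, so only chunks from newly added files need to be embedded.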
55+
56+
## Running the RAG
57+
58+
The instruction manuals for both Monopoly and Ticket To Ride have been loaded into the Chroma DB. Ask the RAG questions about these two board games and see how well it does answering your questions. The RAG can be invoked using the following command with the sample question:
59+
60+
```
61+
python query_data.py How do I build a hotel in monopoly?
62+
```
63+
64+
Here are some additional questions you can try:
65+
66+
- How much total money does a player start with in Monopoly? (Answer with the number only)
67+
- How many points does the longest continuous train get in Ticket to Ride? (Answer with the number only)
68+
69+
You can also browse the instruction manuals that are in the `./data_boardgames` folder to come up with your own questions.
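
Since the chatbot is also exposed via REST endpoints, the same question can be sent over HTTP. The sketch below makes two assumptions that this README does not confirm: that the FastAPI app in `api.py` is served locally (for example with `uvicorn api:app`, which listens on `http://127.0.0.1:8000` by default), and that `QueryInput` accepts a JSON body with a single `query` field. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumed base URL: uvicorn's default host and port.
BASE_URL = "http://127.0.0.1:8000"


def build_rag_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for the /rag-query endpoint (payload shape is assumed)."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/rag-query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> dict:
    """Send the question and return the decoded JSON response."""
    with urllib.request.urlopen(build_rag_request(question), timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not touch the network; ask() does.
request = build_rag_request("How do I build a hotel in Monopoly?")
print(request.get_method(), request.full_url)
```

Calling `ask(...)` requires both the API server and Ollama to be running. The `/rag-query` handler in `api.py` returns a JSON object with `query`, `response`, and `sources` keys.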

## Running the test cases

```
pytest test_rag.py
```

api.py

Lines changed: 100 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,100 @@
import logging
from query_data import query_rag
from fastapi import FastAPI, Request, HTTPException, status
from fastapi.exceptions import RequestValidationError
from fastapi.responses import JSONResponse
from pydantic import BaseModel
from models.rag_query_model import QueryInput, QueryOutput
from utils.async_utils import async_retry


class Message(BaseModel):
    """ Message class defined in Pydantic """
    channel: str
    author: str
    text: str


app = FastAPI(
    title="PDF Document Chatbot",
    description="Endpoints for various PDF documents",
)

channel_list = ["general", "dev", "marketing"]
message_map = {}


@app.exception_handler(RequestValidationError)
async def validation_exception_handler(request: Request, exc: RequestValidationError):
    # Collapse newlines and runs of whitespace into single spaces for logging.
    exc_str = ' '.join(f'{exc}'.split())
    logging.error(f"{request}: {exc_str}")
    content = {'status_code': 10422, 'message': exc_str, 'data': None}
    return JSONResponse(content=content, status_code=status.HTTP_422_UNPROCESSABLE_ENTITY)


@async_retry(max_retries=10, delay=1)
async def invoke_agent_with_retry(query: str):
    """
    Retry the agent if a tool fails to run. This can help when there
    are intermittent connection issues to external APIs.
    """

    return await query_rag({"input": query})


@app.get("/")
async def get_status():
    return {"status": "running"}


@app.post("/post_message", status_code=status.HTTP_201_CREATED)
def post_message(message: Message):
    """Post a new message to the specified channel."""
    channel = message.channel
    if channel in channel_list:
        # message_map[channel].append(message)
        return message
    else:
        raise HTTPException(status_code=404, detail="channel not found")


@app.post("/rag-query")
async def query_rag_api(query: QueryInput):
    print(f"api.py - API Request Data: {query}")
    query_response = query_rag({"input": query})
    print(query_response)

    # query: str
    # response: str
    # sources: list[str]
    query_response2 = {
        "query": query, "response": query_response["response"], "sources": query_response["sources"]}
    print(f"Query Response2: {query_response2}")

    # query_response["intermediate_steps"] = [
    #     str(s) for s in query_response["intermediate_steps"]
    # ]

    return query_response2


@app.post("/rag-query2")
async def query_rag_api2(
    query: QueryInput,
) -> QueryOutput:
    query_response = query_rag({"input": query})
    print(query_response)

    # query: str
    # response: str
    # sources: list[str]
    # Pydantic models are not subscriptable, so use attribute access.
    # Assumes QueryInput defines a `query` field.
    query_text = query.query
    query_response2 = {
        "query": query_text, "response": query_response["response"], "sources": query_response["sources"]}
    print(f"Query Response2: {query_response2}")

    # query_response["intermediate_steps"] = [
    #     str(s) for s in query_response["intermediate_steps"]
    # ]

    return query_response2
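
`utils/async_utils.py` is not shown here. The following is a minimal sketch of what a decorator with the `async_retry(max_retries, delay)` signature used above might look like. It assumes the decorator retries on any exception and sleeps `delay` seconds between attempts; the real implementation may differ.

```python
import asyncio
import functools


def async_retry(max_retries: int = 3, delay: float = 1.0):
    """Retry an async function up to max_retries times, pausing between attempts."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, max_retries + 1):
                try:
                    return await func(*args, **kwargs)
                except Exception as exc:
                    if attempt == max_retries:
                        raise  # out of attempts: surface the last error
                    print(f"Attempt {attempt} failed: {exc}; retrying in {delay}s")
                    await asyncio.sleep(delay)
        return wrapper
    return decorator


calls = {"count": 0}


@async_retry(max_retries=3, delay=0)
async def flaky() -> str:
    """Fail twice, then succeed, to exercise the retry loop."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = asyncio.run(flaky())
print(result, calls["count"])
```

A decorator of this shape only works on coroutine functions. In the code above, `invoke_agent_with_retry` awaits `query_rag` while the two endpoints call it without `await`, so it is worth checking whether `query_rag` is actually asynchronous before routing the endpoints through the retry wrapper.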

data_boardgames/monopoly.pdf

611 KB
Binary file not shown.

data_boardgames/ticket_to_ride.pdf

3.4 MB
Binary file not shown.
