Commit 030cf01

Graph RAG recipe
Signed-off-by: Sanjeev Rampal <[email protected]>
1 parent 2b09302 commit 030cf01

File tree

17 files changed: +980 −0 lines changed
Lines changed: 67 additions & 0 deletions
SHELL := /bin/bash
APP ?= grag
PORT ?= 8501
CHROMADB_PORT ?= 8000

include ../../common/Makefile.common

RECIPE_BINARIES_PATH := $(shell realpath ../../common/bin)
RELATIVE_MODELS_PATH := ../../../models
RELATIVE_TESTS_PATH := ../tests
.PHONY: run-chromadb
run-chromadb:
	podman run -it -p $(CHROMADB_PORT):$(CHROMADB_PORT) -e CHROMADB_ENDPOINT=http://10.88.0.1:8000/v1 ${CHROMADB_IMAGE}
# rag requires custom bootc because it uses an extra build-arg for CHROMADB_IMAGE (other apps use ../../common/Makefile.common target)
.PHONY: bootc
bootc: quadlet
	"${CONTAINER_TOOL}" build \
		$(ARCH:%=--arch %) \
		$(BUILD_ARG_FILE:%=--build-arg-file=%) \
		$(FROM:%=--from %) \
		$(AUTH_JSON:%=-v %:/run/containers/0/auth.json) \
		--security-opt label=disable \
		--cap-add SYS_ADMIN \
		--build-arg MODEL_IMAGE=$(MODEL_IMAGE) \
		--build-arg APP_IMAGE=$(APP_IMAGE) \
		--build-arg CHROMADB_IMAGE=$(CHROMADB_IMAGE) \
		--build-arg SERVER_IMAGE=$(SERVER_IMAGE) \
		--build-arg "SSHPUBKEY=$(SSH_PUBKEY)" \
		-f bootc/$(CONTAINERFILE) \
		-t ${BOOTC_IMAGE} .
	@echo ""
	@echo "Successfully built bootc image '${BOOTC_IMAGE}'."
	@echo "You may now convert the image into a disk image via bootc-image-builder"
	@echo "or the Podman Desktop Bootc Extension. For more information, please refer to"
	@echo "  * https://github.com/osbuild/bootc-image-builder"
	@echo "  * https://github.com/containers/podman-desktop-extension-bootc"

# rag requires custom quadlet target for CHROMADB_IMAGE substitution
# (other apps use ../../common/Makefile.common target)
.PHONY: quadlet
quadlet:
	# Modify quadlet files to match the server, model and app image
	mkdir -p build
	sed -e "s|SERVER_IMAGE|${SERVER_IMAGE}|" \
	    -e "s|APP_IMAGE|${APP_IMAGE}|g" \
	    -e "s|MODEL_IMAGE|${MODEL_IMAGE}|g" \
	    -e "s|CHROMADB_IMAGE|${CHROMADB_IMAGE}|g" \
	    -e "s|APP|${APP}|g" \
	    quadlet/${APP}.image \
	    > build/${APP}.image
	sed -e "s|SERVER_IMAGE|${SERVER_IMAGE}|" \
	    -e "s|APP_IMAGE|${APP_IMAGE}|g" \
	    -e "s|MODEL_IMAGE|${MODEL_IMAGE}|g" \
	    -e "s|CHROMADB_IMAGE|${CHROMADB_IMAGE}|g" \
	    quadlet/${APP}.yaml \
	    > build/${APP}.yaml
	cp quadlet/${APP}.kube build/${APP}.kube
# rag requires custom bootc-run because it uses an extra port for chromadb
# (other apps use ../../common/Makefile.common target)
.PHONY: bootc-run
bootc-run:
	podman run -d --rm --name $(APP)-bootc -p 8080:8501 -p 8090:8000 --privileged \
		$(AUTH_JSON:%=-v %:/run/containers/0/auth.json) \
		$(BOOTC_IMAGE) /sbin/init
Lines changed: 211 additions & 0 deletions
# Graph RAG (Retrieval Augmented Generation) Chat Application

.. THIS IS A WORK IN PROGRESS CURRENTLY. DO NOT USE YET ..

This demo provides a simple recipe to help developers start building their own custom Graph RAG (Graph Retrieval Augmented Generation) applications. It consists of three main components: the Model Service, the Graph Database and the AI Application.

There are a few options today for local Model Serving, but this recipe will use [`llama-cpp-python`](https://github.com/abetlen/llama-cpp-python) and their OpenAI compatible Model Service. There is a Containerfile provided that can be used to build this Model Service within the repo, [`model_servers/llamacpp_python/base/Containerfile`](/model_servers/llamacpp_python/base/Containerfile).

In order for the LLM to interact with our documents, we need them stored and available in such a manner that we can retrieve a small subset of them that are relevant to our query. To do this we employ a Graph Database alongside an embedding model. The documents are converted into a graph representation, which is then stored in the Graph Database. This graph structure captures the semantics of the input documents better than basic RAG does, including the ability to extract logical entities and their relationships from the documents. The Graph Database also supports vector-based indexing of the graph structure, which allows it to be integrated with RAG prompt-chaining libraries. In this recipe we use [neo4j](https://neo4j.com/product/neo4j-graph-database/) as our Graph Database. A sketch of this ingestion step is shown below.
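
To make the ingestion step concrete, here is a minimal sketch of how documents can be converted into a graph and loaded into Neo4j with Langchain. This is illustrative rather than the exact code in this recipe: it assumes the `langchain-experimental`, `langchain-community`, `langchain-openai` and `neo4j` packages are installed, and graph extraction with `LLMGraphTransformer` generally works best with models that support structured output.

```python
from langchain_core.documents import Document
from langchain_openai import ChatOpenAI
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_community.graphs import Neo4jGraph

# Point at the local OpenAI-compatible Model Service (endpoint is an assumption).
llm = ChatOpenAI(base_url="http://localhost:8001/v1", api_key="not-needed", temperature=0)

# Extract logical entities and their relationships from raw text as graph documents.
transformer = LLMGraphTransformer(llm=llm)
docs = [Document(page_content="Neo4j is a graph database written in Java.")]
graph_documents = transformer.convert_to_graph_documents(docs)

# Store the extracted graph in Neo4j (credentials are ignored when NEO4J_AUTH=none).
graph = Neo4jGraph(url="bolt://localhost:7687", username="neo4j", password="")
graph.add_graph_documents(graph_documents)
```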

Our AI Application will connect to our Model Service via its OpenAI compatible API. In this example we rely on [Langchain's](https://python.langchain.com/docs/get_started/introduction) python package to simplify communication with our Model Service, and we use [Streamlit](https://streamlit.io/) for our UI layer. Below is an example of the RAG application.

![](/assets/rag_ui.png)

## Try the RAG chat application

_COMING SOON to AI LAB_

The [Podman Desktop](https://podman-desktop.io) [AI Lab Extension](https://github.com/containers/podman-desktop-extension-ai-lab) will include this recipe among others once it is completed. To try it out, open `Recipes Catalog` -> `RAG Chatbot` and follow the instructions to start the application.

If you prefer building and running the application from the terminal, please run the following commands from this directory.

First, build the application's metadata and run the generated Kubernetes YAML, which will spin up a Pod along with a number of containers:
```
make quadlet
podman kube play build/grag.yaml
```

The Pod is named `grag`, so you may use [Podman](https://podman.io) to manage the Pod and its containers:
```
podman pod list
podman ps
```

To stop and remove the Pod, run:
```
podman pod stop grag
podman pod rm grag
```

Once the Pod is running, please refer to the section below to [interact with the RAG chatbot application](#interact-with-the-ai-application).

# Build the Application

In order to build this application we will need two models, a Graph Database, a Model Service and an AI Application.

* [Download models](#download-models)
* [Deploy the Graph Database](#deploy-the-graph-database)
* [Build the Model Service](#build-the-model-service)
* [Deploy the Model Service](#deploy-the-model-service)
* [Build the AI Application](#build-the-ai-application)
* [Deploy the AI Application](#deploy-the-ai-application)
* [Interact with the AI Application](#interact-with-the-ai-application)

### Download models

If you are just getting started, we recommend using [Granite-7B-Lab](https://huggingface.co/instructlab/granite-7b-lab-GGUF). This is a well-performing mid-sized model with an Apache-2.0 license that has been quantized and saved in the [GGUF format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md).

The recommended model can be downloaded using the code snippet below:

```bash
cd ../../../models
curl -sLO https://huggingface.co/instructlab/granite-7b-lab-GGUF/resolve/main/granite-7b-lab-Q4_K_M.gguf
cd ../recipes/natural_language_processing/rag
```

_A full list of supported open models is forthcoming._

In addition to the LLM, RAG applications also require an embedding model to convert documents between natural language and vector representations. For this demo we will use [`BAAI/bge-base-en-v1.5`](https://huggingface.co/BAAI/bge-base-en-v1.5); it is a fairly standard model for this use case and has an MIT license.

The code snippet below can be used to pull a copy of the `BAAI/bge-base-en-v1.5` embedding model and store it in your `models/` directory.

```python
from huggingface_hub import snapshot_download
snapshot_download(repo_id="BAAI/bge-base-en-v1.5",
                  cache_dir="models/",
                  local_files_only=False)
```
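
At runtime the application loads this embedding model locally. Here is a minimal sketch of how such a model is typically wired into Langchain; it assumes the `langchain-community` and `sentence-transformers` packages are installed and may differ from the exact code in `rag_app.py`:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings

# Load the local copy of the embedding model downloaded above.
embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5",
                                   cache_folder="models/")

vector = embeddings.embed_query("What is Graph RAG?")
print(len(vector))  # bge-base-en-v1.5 produces 768-dimensional vectors
```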

### Deploy the Graph Database

To deploy the Graph Database service locally, simply use the existing Neo4j image. The Graph Database is ephemeral and will need to be re-populated each time the container restarts. When implementing RAG/Graph RAG in production, you will want a long-running and backed-up Graph Database.

#### Neo4j
```bash
podman run \
    --restart always \
    --publish=7474:7474 --publish=7687:7687 --env NEO4J_AUTH=none \
    neo4j
```
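
Once the container is up, you can confirm the database is reachable before wiring it into the application. A quick sketch using the official `neo4j` Python driver (assumed to be installed):

```python
from neo4j import GraphDatabase

# NEO4J_AUTH=none was set above, so no credentials are required.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=None)
with driver.session() as session:
    print(session.run("RETURN 1 AS ok").single()["ok"])  # prints 1 if reachable
driver.close()
```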

### Build the Model Service

The complete instructions for building and deploying the Model Service can be found in the [llamacpp_python model-service document](../model_servers/llamacpp_python/README.md).

The Model Service can be built with the following code snippet:

```bash
cd model_servers/llamacpp_python
podman build -t llamacppserver -f ./base/Containerfile .
```

### Deploy the Model Service

The complete instructions for building and deploying the Model Service can be found in the [llamacpp_python model-service document](../model_servers/llamacpp_python/README.md).

The local Model Service relies on a volume mount to the localhost to access the model files. You can start your local Model Service using the following Podman command:
```
podman run --rm -it \
    -p 8001:8001 \
    -v Local/path/to/locallm/models:/locallm/models \
    -e MODEL_PATH=models/<model-filename> \
    -e HOST=0.0.0.0 \
    -e PORT=8001 \
    llamacppserver
```
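
Before moving on, it can be useful to sanity-check the endpoint. A minimal sketch using the `openai` Python client (assumed installed; the model name below is a placeholder, since `llama-cpp-python` serves whatever model was mounted):

```python
from openai import OpenAI

# The Model Service exposes an OpenAI-compatible API; no real key is needed locally.
client = OpenAI(base_url="http://localhost:8001/v1", api_key="sk-no-key-required")

resp = client.chat.completions.create(
    model="granite-7b-lab-Q4_K_M.gguf",  # placeholder name
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(resp.choices[0].message.content)
```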

### Build the AI Application

Now that the Model Service is running, we want to build and deploy our AI Application. Use the provided Containerfile to build the AI Application image in the `rag-langchain/` directory.

```bash
cd rag
make APP_IMAGE=grag build
```

### Deploy the AI Application

Make sure the Model Service and the Graph Database are up and running before starting this container image. When starting the AI Application container image we need to direct it to the correct `MODEL_ENDPOINT`. This could be any appropriately hosted Model Service (running locally or in the cloud) using an OpenAI compatible API. In our case the Model Service is running inside the Podman machine, so we need to provide it with the appropriate address, `10.88.0.1`. The same goes for the Graph Database: make sure `GRAPHDB_HOST` is set to `10.88.0.1` for communication within the Podman virtual machine.

There also needs to be a volume mount into the `models/` directory so that the application can access the embedding model, as well as a volume mount into the `data/` directory from which it can pull documents to populate the Graph Database.

The following Podman command can be used to run your AI Application:

```bash
podman run --rm -it -p 8501:8501 \
    -e MODEL_ENDPOINT=http://10.88.0.1:8001 \
    -e GRAPHDB_HOST=10.88.0.1 \
    -v Local/path/to/locallm/models/:/rag/models \
    grag
```

### Interact with the AI Application

Everything should now be up and running, with the RAG application available at [`http://localhost:8501`](http://localhost:8501). With this starting point established, users should have an easier time customizing and building their own LLM-enabled Graph RAG applications. A sketch of what a query against the graph looks like under the hood follows below.
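
For orientation, here is a minimal sketch of the query side of a Graph RAG chain: the vector index over the graph retrieves relevant nodes and the LLM answers from them. All index, label and property names here are illustrative assumptions, not the exact values used by `rag_app.py`:

```python
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Neo4jVector
from langchain_openai import ChatOpenAI
from langchain.chains import RetrievalQA

embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-base-en-v1.5")

# Build (or reuse) a vector index over text properties of graph nodes.
store = Neo4jVector.from_existing_graph(
    embedding=embeddings,
    url="bolt://localhost:7687",
    username="neo4j",
    password="",                      # NEO4J_AUTH=none in this demo
    index_name="doc_index",           # illustrative
    node_label="Document",            # illustrative
    text_node_properties=["text"],    # illustrative
    embedding_node_property="embedding",
)

llm = ChatOpenAI(base_url="http://10.88.0.1:8001/v1", api_key="not-needed")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())
print(qa.invoke({"query": "What does the document say about Neo4j?"}))
```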

### Embed the AI Application in a Bootable Container Image

To build a bootable container image that includes this sample RAG chatbot workload as a service that starts when a system is booted, cd into this folder and run:

```
make BOOTC_IMAGE=quay.io/your/rag-bootc:latest bootc
```

You can substitute the FROM image of bootc/Containerfile using the Makefile FROM option:

```
make FROM=registry.redhat.io/rhel9/rhel-bootc:9.4 BOOTC_IMAGE=quay.io/your/rag-bootc:latest bootc
```

The magic happens when you have a bootc enabled system running. If you do, and you'd like to update the operating system to the OS you just built with the RAG chatbot application, it's as simple as ssh-ing into the bootc system and running:

```
bootc switch quay.io/your/rag-bootc:latest
```

Upon a reboot, you'll see that the RAG chatbot service is running on the system.

Check on the service with:

```
ssh user@bootc-system-ip
sudo systemctl status rag
```

#### What are bootable containers?

What's a [bootable OCI container](https://containers.github.io/bootc/) and what's it got to do with AI?

That's a good question! We think it's a good idea to embed AI workloads (or any workload!) into bootable images at _build time_ rather than at _runtime_. This extends the benefits, such as portability and predictability, that containerizing applications provides to the operating system. Bootable OCI images bake exactly what you need to run your workloads into the operating system at build time by using your favorite containerization tools. Might I suggest [podman](https://podman.io/)?

Once installed, a bootc enabled system can be updated by providing an updated bootable OCI image from any OCI image registry with a single `bootc` command. This works especially well for fleets of devices that have fixed workloads - think factories or appliances. Who doesn't want to add a little AI to their appliance, am I right?

Bootable images lend toward immutable operating systems, and the more immutable an operating system is, the less that can go wrong at runtime!

##### Creating bootable disk images

You can convert a bootc image to a bootable disk image using the [quay.io/centos-bootc/bootc-image-builder](https://github.com/osbuild/bootc-image-builder) container image.

This container image allows you to build and deploy [multiple disk image types](../../common/README_bootc_image_builder.md) from bootc container images.

The disk image type can be set via the DISK_TYPE Makefile variable:

`make bootc-image-builder DISK_TYPE=ami`

### Makefile variables

There are several [Makefile variables](../../common/README.md) defined within each `recipe` Makefile which can be used to override defaults for a variety of make targets.
Lines changed: 37 additions & 0 deletions
version: v1.0
application:
  type: language
  name: rag-demo
  description: A RAG chat bot using local documents.
  containers:
    - name: llamacpp-server
      contextdir: ../../../model_servers/llamacpp_python
      containerfile: ./base/Containerfile
      model-service: true
      backend:
        - llama-cpp
      arch:
        - arm64
        - amd64
      ports:
        - 8001
      image: quay.io/ai-lab/llamacpp_python:latest
    - name: chromadb-server
      contextdir: ../../../vector_dbs/chromadb
      containerfile: Containerfile
      vectordb: true
      arch:
        - arm64
        - amd64
      ports:
        - 8000
      image: quay.io/ai-lab/chromadb:latest
    - name: rag-inference-app
      contextdir: app
      containerfile: Containerfile
      arch:
        - arm64
        - amd64
      ports:
        - 8501
      image: quay.io/ai-lab/rag:latest
Lines changed: 15 additions & 0 deletions
# Base image: UBI 9 with Python 3.11
FROM registry.access.redhat.com/ubi9/python-311:1-72.1722518949
USER root
ENV LD_LIBRARY_PATH="/usr/local/lib"
####
WORKDIR /rag
# Install the application's Python dependencies
COPY requirements.txt .
RUN pip install --upgrade pip
RUN pip install --no-cache-dir --upgrade -r /rag/requirements.txt
# Copy in the Streamlit app and the graph database management helper
COPY rag_app.py .
COPY manage_graphdb.py .
EXPOSE 8501
# Cache downloaded Hugging Face models under /rag/models/
ENV HF_HUB_CACHE=/rag/models/
RUN mkdir -p /rag/models/
# Let the arbitrary (root-group) container user write to the model cache
RUN chgrp -R 0 /rag/models/ && chmod -R g=u /rag/models/
ENTRYPOINT [ "streamlit", "run", "rag_app.py" ]
