RAG LLM Demo UI

A simple Gradio app that demonstrates a configurable Retrieval-Augmented Generation (RAG) flow. Pick an LLM and a vector database, tweak retrieval parameters, and inspect the RAG internals.

Features

  • Multiple Vector Databases: Connect to Qdrant, Redis, Elasticsearch, pgvector, or SQL Server.
  • Any OpenAI-compatible LLM: Provide one or more base URLs, and the UI will auto-discover available models.
  • Streaming Responses: Get incremental updates from the assistant for a real-time chat experience (see the sketch after this list).
  • Inspectable RAG Internals: View the retrieved documents and the exact prompt sent to the LLM to understand the context.
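
For example, here is a minimal sketch of consuming a streaming response from an OpenAI-compatible server with the openai Python client. The base URL, model name, and API key are placeholders, not values required by this app:

    # Sketch: stream a chat completion from an OpenAI-compatible server.
    # Base URL, model id, and API key below are placeholders.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

    stream = client.chat.completions.create(
        model="my-model",  # any model id reported by GET /models
        messages=[{"role": "user", "content": "What is RAG?"}],
        stream=True,
    )
    for chunk in stream:
        # Each chunk carries an incremental delta of the assistant's reply.
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="", flush=True)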

How it Works

The application follows a standard RAG pattern:

  1. User Query: The user asks a question in the chat interface.
  2. Document Retrieval: The app converts the query into an embedding and searches the selected vector database to find the most relevant document chunks.
  3. Prompt Augmentation: A prompt is constructed for the LLM, containing the original user query and the content of the retrieved documents as context.
  4. LLM Generation: The augmented prompt is sent to the selected LLM, which generates a response based on the provided context.
  5. Stream to UI: The response is streamed back to the user interface.
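
The same flow as a compact Python sketch. The helpers embed_query() and vector_search() are hypothetical stand-ins for the configured embedding model and vector database, not real APIs of this app:

    # Hypothetical sketch of the RAG flow above. embed_query() and
    # vector_search() stand in for the configured embedding model and
    # vector database; they are illustrative, not real app APIs.
    def answer(question: str, llm_client, k: int = 4):
        # 2. Retrieval: embed the query and fetch the top-k chunks.
        query_vector = embed_query(question)
        docs = vector_search(query_vector, top_k=k)

        # 3. Augmentation: pack the retrieved text into the prompt as context.
        context = "\n\n".join(doc.text for doc in docs)
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"

        # 4-5. Generation: stream tokens back to the UI as they arrive.
        stream = llm_client.chat.completions.create(
            model="my-model",
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )
        for chunk in stream:
            if chunk.choices and chunk.choices[0].delta.content:
                yield chunk.choices[0].delta.content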

Requirements

  • Conda: Recommended for managing local Python environments.
  • Podman or Docker: Required for running the application as a container.
  • ODBC Driver 18 for SQL Server: Required only if you are developing locally (not in a container) and need to connect to SQL Server. The container image includes this driver.

Quickstart (Local)

  1. Set up the Environment: Create and activate the Conda environment.

    conda env create -f environment.yaml
    conda activate rag-llm-demo-ui
  2. Configure your settings: Copy the example environment file.

    cp .env.example .env

    Now edit .env with your LLM and DB provider URLs and credentials. This file is excluded via .gitignore; even so, as a best practice, avoid storing sensitive production credentials in it.

    You can also export the variables directly in your shell:

    export LOG_LEVEL="INFO"
    export LLM_URLS='["http://localhost:1234/v1"]'
    export DB_PROVIDERS='[{"type": "qdrant", "collection": "docs", "url": "http://localhost:6333", "embedding_model": "sentence-transformers/all-mpnet-base-v2"}]'
    export HOST="0.0.0.0"
    export PORT="7860"
  3. Run the app:

    make run

    The UI will be available at http://localhost:7860.

Quickstart (Container)

  1. Build the container image:

    make build
  2. Run the container: The following command runs the container on the host's network, which makes it easy to reach other services listening on localhost (such as an LLM server or a database).

    podman run --rm --network host \
      -e LOG_LEVEL="INFO" \
      -e LLM_URLS='["http://localhost:1234/v1"]' \
      -e DB_PROVIDERS='[{"type":"qdrant","collection":"docs","url":"http://localhost:6333","embedding_model":"sentence-transformers/all-mpnet-base-v2"}]' \
      localhost/rag-llm-demo-ui:latest

    The UI will be available at http://localhost:7860.

Configuration

The application is configured via environment variables.

  • LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR).
  • LLM_URLS: A JSON array of base URLs for OpenAI-compatible APIs. The app queries each URL's /models endpoint to discover available models and /chat/completions to generate responses.
  • DB_PROVIDERS: A JSON array where each object defines a connection to a vector database (see Provider Examples below).
  • HOST: The interface the server binds to (e.g., 0.0.0.0).
  • PORT: The port the UI listens on (e.g., 7860).
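
As a rough sketch of how these JSON-valued variables can be parsed, and of what model auto-discovery against an OpenAI-compatible /models endpoint typically looks like (illustrative only, not the app's actual source):

    # Sketch: parse the JSON-valued env vars and list models from each
    # OpenAI-compatible base URL. Illustrative, not the app's actual code.
    import json
    import os

    import requests

    llm_urls = json.loads(os.environ.get("LLM_URLS", "[]"))
    db_providers = json.loads(os.environ.get("DB_PROVIDERS", "[]"))

    models = []
    for base_url in llm_urls:
        # OpenAI-compatible servers expose GET <base_url>/models and
        # return {"data": [{"id": "<model-name>", ...}, ...]}.
        resp = requests.get(f"{base_url}/models", timeout=10)
        resp.raise_for_status()
        models.extend(m["id"] for m in resp.json()["data"])

    print("Discovered models:", models)
    print("Configured providers:", [p["type"] for p in db_providers])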

Provider Examples

Below are examples for the DB_PROVIDERS variable; a retrieval sketch using the Qdrant entry follows the examples.

Qdrant

{
  "type": "qdrant",
  "collection": "docs",
  "url": "http://localhost:6333",
  "embedding_model": "sentence-transformers/all-mpnet-base-v2"
}

Redis

{
  "type": "redis",
  "index": "docs",
  "url": "redis://localhost:6379",
  "embedding_model": "sentence-transformers/all-mpnet-base-v2"
}

Elasticsearch

{
  "type": "elastic",
  "index": "docs",
  "url": "http://localhost:9200",
  "user": "elastic",
  "password": "changeme",
  "embedding_model": "sentence-transformers/all-mpnet-base-v2"
}

pgvector (PostgreSQL)

{
  "type": "pgvector",
  "collection": "docs",
  "url": "postgresql+psycopg://user:pass@localhost:5432/mydb",
  "embedding_model": "sentence-transformers/all-mpnet-base-v2"
}

SQL Server

{
  "type": "mssql",
  "table": "docs",
  "connection_string": "Driver={ODBC Driver 18 for SQL Server}; Server=localhost,1433; Database=embeddings; UID=sa; PWD=StrongPassword!; TrustServerCertificate=yes; Encrypt=no;",
  "embedding_model": "sentence-transformers/all-mpnet-base-v2"
}
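
To illustrate what a provider entry drives at query time, here is a sketch of retrieval against the Qdrant example above, using the qdrant-client and sentence-transformers packages directly. The app's own retrieval code may differ; the query text and top-k value are arbitrary:

    # Sketch: retrieval against the Qdrant provider entry above.
    # Assumes `pip install qdrant-client sentence-transformers`.
    from qdrant_client import QdrantClient
    from sentence_transformers import SentenceTransformer

    provider = {
        "type": "qdrant",
        "collection": "docs",
        "url": "http://localhost:6333",
        "embedding_model": "sentence-transformers/all-mpnet-base-v2",
    }

    encoder = SentenceTransformer(provider["embedding_model"])
    client = QdrantClient(url=provider["url"])

    # Embed the query with the same model used to index the collection.
    query_vector = encoder.encode("What is RAG?").tolist()
    hits = client.search(
        collection_name=provider["collection"],
        query_vector=query_vector,
        limit=4,
    )
    for hit in hits:
        print(hit.score, hit.payload)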

Environment Loading

  • For local development, the app uses python-dotenv to load variables from a .env file.
  • Variables set directly in the OS environment always take precedence over those in the .env file (see the sketch after this list).
  • SECURITY: Do not commit secrets or credentials into .env files in your Git repository. Use .gitignore to exclude your local .env file. For production, inject these variables using your platform's secrets management tools.
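
A minimal sketch of this precedence: python-dotenv's load_dotenv() defaults to override=False, so values already present in the OS environment are kept:

    # Sketch: .env values fill in gaps, but existing OS env vars win.
    import os

    from dotenv import load_dotenv

    # override=False (the default) keeps values already in os.environ.
    load_dotenv()

    print(os.environ.get("LOG_LEVEL"))  # OS value if exported, else .env value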

Makefile Targets

Common development tasks are available as make commands:

  • make install-deps: Install pinned dependencies from requirements.txt.
  • make update-deps: Re-resolve and update the requirements.txt lockfile (requires pip-tools).
  • make run: Run the Gradio app locally with python app.py.
  • make build: Build the container image using Podman.
  • make upload: Push the container image to a registry.

License

This project is licensed under the Apache License, Version 2.0.
