This repo links llama.cpp and open-webui in a single docker-compose.yml project, with a simple Python API in the middle to launch/kill llama-server instances on demand. The API also has bare-bones VRAM management, so it will remove older models from VRAM when there isn't enough room for new ones. There is no such functionality for RAM, though.
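In other words, older instances are stopped until the requested model fits. Below is a minimal sketch of that idea only; it is not the repository's actual code, and the class name, VRAM accounting, and stop/start hooks are all illustrative:

```python
from collections import OrderedDict


class VramManager:
    """Illustrative only: tracks running llama-server instances and their VRAM use."""

    def __init__(self, total_vram_mb: int):
        self.total = total_vram_mb
        self.running = OrderedDict()  # model name -> VRAM it occupies (MB), oldest first

    def used(self) -> int:
        return sum(self.running.values())

    def ensure_room(self, needed_mb: int) -> None:
        # Evict the oldest instances until the new model fits.
        while self.running and self.used() + needed_mb > self.total:
            name, freed = self.running.popitem(last=False)
            print(f"stopping {name}, freeing ~{freed} MB")  # real code would kill llama-server here

    def launch(self, name: str, needed_mb: int) -> None:
        self.ensure_room(needed_mb)
        self.running[name] = needed_mb  # real code would spawn a llama-server process here
```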
Services:
- `webui` is a stock docker image from open-webui
- `openresty` is a stock docker image for openresty, used to proxy OpenAI API requests from webui to the API. Some functionality is implemented as Lua scripts
- `llamacpp` launches multiple llama-server instances through a Python Flask API. The Dockerfile for this service is based on the official llama.cpp Dockerfile plus Python for the API
- `mcpo` is a stock docker image from mcpo to expose MCP servers to webui. An example config file with a date/time tool is provided in `./mcpo`. Each tool has to be manually added in the webui settings and enabled for each model separately. For example, add the default time tool like this (Admin -> Settings -> Tools):
  - url: http://mcpo:8000/time
  - auth: key
  - name: time

  Then enable this tool in each model's settings.
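For reference, mcpo itself reads a Claude-Desktop-style JSON config listing the MCP servers to expose. The example file shipped in `./mcpo` should look roughly like the sketch below; the exact contents may differ, and the timezone argument is just a placeholder:

```json
{
  "mcpServers": {
    "time": {
      "command": "uvx",
      "args": ["mcp-server-time", "--local-timezone=Europe/Warsaw"]
    }
  }
}
```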
It's developed and tested on Linux; not tested on other OSs.
```
git clone --recurse-submodules https://github.com/mpazdzioch/llamacpp-webui-glue.git
cd llamacpp-webui-glue
```

Copy the template file to .env:

```
cp env_template .env
```

Edit the .env file, filling in all missing values for your system. In most cases, pointing MODEL_DIR to your gguf files folder is all you need to do.
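For illustration, a minimal `.env` could look like the following; `env_template` lists the full set of variables, and the path here is just a placeholder:

```
MODEL_DIR=/home/user/models
```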
This step may take a while because the docker images have to be built. The first launch of webui also does some initialization.

```
docker compose up
```

or for CPU-only inference:

```
docker compose -f docker-compose-cpu.yml up
```

The webui interface will be available on localhost:7777. If the UI loads but there are no models to choose from in the dropdown list, check the docker compose logs. Both webui and the llama instances log everything to the stdout of docker compose.
- There are no models to choose from in the dropdown list in webui:
  - Check the logs from the llamacpp container for details; llama-server failed to load the model(s) for some reason.
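To inspect those logs while reproducing the problem, the standard docker compose commands work, e.g.:

```
docker compose logs -f llamacpp
```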
- You only need a folder with .gguf file(s) to use this project. Settings from `./model-config/defaul-config.yml` will be applied to all models. This file contains default llama-server cli options.
- However, when you want to define custom llama-server options for some models, for example a custom GPU split, context size, or anything else that llama-server allows via cli options, create a .yml file in `./model-config` (a second example with a GPU split follows this list). When your model file is named `codestral:22b-v0.1-q8_0.gguf`, create `./model-config/codestral:22b-v0.1-q8_0.gguf.yml` and the options from this file will be used automatically when launching this model. Example `codestral:22b-v0.1-q8_0.gguf.yml`:

  ```yaml
  file: /home/models/codestral:22b-v0.1-q8_0.gguf
  llama-server:
    --seed: 1337
    --temp: 0
    --ctx-size: 4096
    --mlock: no_value_flag
  ```

  `file` points to the .gguf path. `llama-server` is simply a list of cli args to pass to llama-server. For options that have no value we use `no_value_flag`.
- To get the available cli args:

  ```
  docker compose run llamacpp /llama-server -h
  ```
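As mentioned above, the same mechanism can carry GPU-related options. A hypothetical config splitting a model across two GPUs with a larger context might look like this (the flags are standard llama-server options; the path and split ratio are placeholders):

```yaml
file: /home/models/codestral:22b-v0.1-q8_0.gguf
llama-server:
  --ctx-size: 16384
  --n-gpu-layers: 99
  --tensor-split: "60,40"
```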
- `data`
  - `llamacpp-logs`: log files for each running llama-server process
  - `restylogs`: logs from the openresty proxy and Lua scripts
  - `webui-data`: data folder for webui
- `llamacpp`
  - `app`: Python API for starting/stopping llama-server instances
  - `llama.cpp`: the llama.cpp github repo as a submodule. Whenever you want to update to a newer version of llama.cpp, just pull inside this repo and `docker compose build llamacpp` (see the sketch after this list)
- `openresty`: nginx configuration and a couple of Lua scripts to proxy OpenAI API requests to the Flask API
- `model-config`: we keep all .yml files with custom model configs here
- `mcpo`: stores the mcpo server config file
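To update the bundled llama.cpp, as mentioned in the `llamacpp` item above, the steps are roughly the following, assuming the submodule lives at `./llamacpp/llama.cpp`:

```
cd llamacpp/llama.cpp
git checkout master && git pull   # submodules start on a detached commit, so check out the branch first
cd ../..
docker compose build llamacpp
```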
```
docker compose run llamacpp python3 scan_model_dirs.py
```

This script looks for .gguf files in the path set in the MODEL_DIR env var in docker-compose.yml. It generates a .yml file for each .gguf file so it's easier to add custom llama.cpp configuration for each model. All files are saved to the ./model-config folder.