# podman-llm

The goal of podman-llm is to make AI even more boring.
## Install

Install podman-llm by running this one-liner:

```
curl -fsSL https://raw.githubusercontent.com/ericcurtin/podman-llm/main/install.sh | sudo bash
```
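If you prefer to review the installer before running it with root privileges, you can download it first (same URL as the one-liner above):

```
$ curl -fsSL https://raw.githubusercontent.com/ericcurtin/podman-llm/main/install.sh -o install.sh
$ less install.sh
$ sudo bash install.sh
```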
|
15 |
| -A model server is a program that serves machine-learning models, such as LLMs, and |
16 |
| -makes their functions available via an API. This makes it easy for developers to |
17 |
| -incorporate AI into their applications. This repository provides descriptions and |
18 |
| -code for building several of these model servers. |
| 13 | +## Usage |
19 | 14 |
|
20 |
| -Many of the sample applications rely on the `llamacpp_python` model server by |
21 |
| -default. This server can be used for various generative AI applications with various models. |
22 |
| -However, each sample application can be paired with a variety of model servers. |
| 15 | +### Running Models |
23 | 16 |
|
24 |
| -Learn how to build and run the llamacpp_python model server by following the |
25 |
| -[llamacpp_python model server README](/model_servers/llamacpp_python/README.md). |
| 17 | +You can run a model using the `run` command. This will start an interactive session where you can query the model. |
26 | 18 |
|
27 |
| -## Current Recipes |
| 19 | +``` |
| 20 | +$ podman-llm run granite |
| 21 | +> Tell me about podman in less than ten words |
| 22 | +A fast, secure, and private container engine for modern applications. |
| 23 | +> |
| 24 | +``` |
28 | 25 |
### Serving Models

To serve a model via HTTP, use the `serve` command. This will start an HTTP server that listens for incoming requests to interact with the model.

```
$ podman-llm serve granite
...
{"tid":"140477699799168","timestamp":1719579518,"level":"INFO","function":"main","line":3793,"msg":"HTTP server listening","n_threads_http":"11","port":"8080","hostname":"127.0.0.1"}
...
```
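With the server running, you can query it over HTTP. A minimal sketch, assuming the llama.cpp server's standard `/completion` endpoint and the `127.0.0.1:8080` address shown in the log output above:

```
$ curl -s http://127.0.0.1:8080/completion \
    -H "Content-Type: application/json" \
    -d '{"prompt": "Tell me about podman in less than ten words", "n_predict": 32}'
```

The response is a JSON object; with the llama.cpp server, the generated text is in its `content` field.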
## Model library

| Model     | Parameters | Run                        |
| --------- | ---------- | -------------------------- |
| granite   | 3B         | `podman-llm run granite`   |
| mistral   | 7B         | `podman-llm run mistral`   |
| merlinite | 7B         | `podman-llm run merlinite` |
## Containerfile Example

Here is an example Containerfile:

```
FROM quay.io/podman-llm/podman-llm:41
LABEL model=/granite-3b-code-instruct.Q4_K_M.gguf
RUN llama-main --hf-repo ibm-granite/granite-3b-code-instruct-GGUF -m granite-3b-code-instruct.Q4_K_M.gguf
```
The `LABEL model` line is important: it tells podman-llm where to find the .gguf file inside the image.

And we build via:

```
podman build -t granite podman-llm/granite:3b
```
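Following the same pattern, a Containerfile for another model from the library might look like this. The Hugging Face repo and .gguf file names below are illustrative assumptions, not taken from this project:

```
FROM quay.io/podman-llm/podman-llm:41
# Assumption: substitute a real GGUF build of the model you want here
LABEL model=/mistral-7b-instruct-v0.2.Q4_K_M.gguf
RUN llama-main --hf-repo TheBloke/Mistral-7B-Instruct-v0.2-GGUF -m mistral-7b-instruct-v0.2.Q4_K_M.gguf
```

It can then be built with `podman build` in the same way as the granite example.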
## Diagram

```
+------------------------+    +--------------------+    +------------------+
|                        |    | Pull runtime layer |    | Pull model layer |
|     podman-llm run     | -> |   with llama.cpp   | -> |   with granite   |
|                        |    |                    |    |                  |
+------------------------+    +--------------------+    |------------------|
                                                        |  Repo options:   |
                                                        +------------------+
                                                            |          |
                                                            v          v
                                                    +--------------+  +---------+
                                                    | Hugging Face |  | quay.io |
                                                    +--------------+  +---------+
                                                             \         /
                                                              \       /
                                                               \     /
                                                                v   v
                                                        +-----------------+
                                                        | Start container |
                                                        | with llama.cpp  |
                                                        | and granite     |
                                                        | model           |
                                                        +-----------------+
```