Commit 5598f91: major version update
Parent commit: 2f77416

22 files changed, +573 -1372 lines changed

.gitignore

Lines changed: 9 additions & 7 deletions

@@ -7,6 +7,15 @@ _generate/
 *.egg-info
 *env*
 
+mtests
+
+
+# custom
+.idea
+_docs
+_examples
+src/.idea
+
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -139,10 +148,3 @@ dmypy.json
 
 # vscode
 .vscode
-
-# custom
-mtests
-.idea
-_docs
-_examples
-src/.idea

CMakeLists.txt

Lines changed: 1 addition & 0 deletions

@@ -6,6 +6,7 @@ set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -pthread")
 
 add_subdirectory(pybind11)
 add_subdirectory(llama.cpp)
+# add_subdirectory(ggml)
 
 file (GLOB CPP_FILES "llama.cpp/*.cpp")
 file (GLOB C_FILES "llama.cpp/*.c")

README.md

Lines changed: 119 additions & 52 deletions

@@ -1,5 +1,6 @@
 # PyLLaMACpp
-Official supported Python bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) + gpt4all
+
+Python bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp)
 
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![PyPi version](https://badgen.net/pypi/v/pyllamacpp)](https://pypi.org/project/pyllamacpp/)
@@ -21,15 +22,18 @@ For those who don't know, `llama.cpp` is a port of Facebook's LLaMA model in pur
 # Table of contents
 <!-- TOC -->
 * [Installation](#installation)
-* [Usage](#usage)
-* [Supported model](#supported-model)
-  * [GPT4All](#gpt4all)
+* [CLI](#cli)
+* [Tutorial](#tutorial)
+  * [Quick start](#quick-start)
+  * [Interactive Dialogue](#interactive-dialogue)
+  * [Different persona](#different-persona)
+* [Supported models](#supported-models)
 * [Discussions and contributions](#discussions-and-contributions)
 * [License](#license)
 <!-- TOC -->
 
 # Installation
-1. The easy way is to use the prebuilt wheels
+1. The easy way is to install the prebuilt wheels
 ```bash
 pip install pyllamacpp
 ```
@@ -42,82 +46,145 @@ git clone --recursive https://github.com/nomic-ai/pyllamacpp && cd pyllamacpp
 pip install .
 ```
 
-# Usage
+# CLI
 
-A simple `Pythonic` API is built on top of `llama.cpp` C/C++ functions. You can call it from Python as follows:
+You can run the following simple command line interface to test the package once it is installed:
 
-```python
-from pyllamacpp.model import Model
+```shell
+pyllamacpp path/to/ggml/model
+```
 
-def new_text_callback(text: str):
-    print(text, end="", flush=True)
+```shell
+pyllamacpp -h
+
+usage: pyllamacpp [-h] [--n_ctx N_CTX] [--n_parts N_PARTS] [--seed SEED] [--f16_kv F16_KV] [--logits_all LOGITS_ALL]
+                  [--vocab_only VOCAB_ONLY] [--use_mlock USE_MLOCK] [--embedding EMBEDDING] [--n_predict N_PREDICT] [--n_threads N_THREADS]
+                  [--repeat_last_n REPEAT_LAST_N] [--top_k TOP_K] [--top_p TOP_P] [--temp TEMP] [--repeat_penalty REPEAT_PENALTY]
+                  [--n_batch N_BATCH]
+                  model
+
+positional arguments:
+  model                 The path of the model file
+
+options:
+  -h, --help            show this help message and exit
+  --n_ctx N_CTX         text context
+  --n_parts N_PARTS
+  --seed SEED           RNG seed
+  --f16_kv F16_KV       use fp16 for KV cache
+  --logits_all LOGITS_ALL
+                        the llama_eval() call computes all logits, not just the last one
+  --vocab_only VOCAB_ONLY
+                        only load the vocabulary, no weights
+  --use_mlock USE_MLOCK
+                        force system to keep model in RAM
+  --embedding EMBEDDING
+                        embedding mode only
+  --n_predict N_PREDICT
+                        Number of tokens to predict
+  --n_threads N_THREADS
+                        Number of threads
+  --repeat_last_n REPEAT_LAST_N
+                        Last n tokens to penalize
+  --top_k TOP_K         top_k
+  --top_p TOP_P         top_p
+  --temp TEMP           temp
+  --repeat_penalty REPEAT_PENALTY
+                        repeat_penalty
+  --n_batch N_BATCH     batch size for prompt processing
 
-model = Model(ggml_model='./models/gpt4all-model.bin', n_ctx=512)
-model.generate("Once upon a time, ", n_predict=55, new_text_callback=new_text_callback, n_threads=8)
 ```
-If you don't want to use the `callback`, you can get the results from the `generate` method once the inference is finished:
+# Tutorial
+
+### Quick start
+A simple `Pythonic` API is built on top of `llama.cpp` C/C++ functions. You can call it from Python as follows:
 
 ```python
-generated_text = model.generate("Once upon a time, ", n_predict=55)
-print(generated_text)
+from pyllamacpp.model import Model
+
+model = Model(ggml_model='./models/gpt4all-model.bin')
+for token in model.generate("Tell me a joke ?"):
+    print(token, end='')
 ```
 
-## Interactive Mode
+### Interactive Dialogue
+You can set up an interactive dialogue by simply keeping the `model` variable alive:
 
-If you want to run the program in interactive mode you can add the `grab_text_callback` function and set `interactive` to True in the generate function. `grab_text_callback` should always return a string unless you wish to signal EOF in which case you should return None.
+```python
+from pyllamacpp.model import Model
+
+model = Model(ggml_model='./models/gpt4all-model.bin')
+while True:
+    try:
+        prompt = input("You: ")
+        if prompt == '':
+            continue
+        print("AI:", end='')
+        for tok in model.generate(prompt):
+            print(tok, end='', flush=True)
+        print()
+    except KeyboardInterrupt:
+        break
+```
+### Different persona
+You can customize the `prompt_context` to _"give the language model a different persona"_ as follows:
 
-```py
+```python
 from pyllamacpp.model import Model
 
-def new_text_callback(text: str):
-    print(text, end="", flush=True)
+prompt_context = """ Act as Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision. To do this, Bob uses a database of information collected from many different sources, including books, journals, online articles, and more.
 
-def grab_text_callback():
-    inpt = input()
-    # To signal EOF, return None
-    if inpt == "END":
-        return None
-    return inpt
+User: Nice to meet you Bob!
+Bob: Welcome! I'm here to assist you with anything you need. What can I do for you today?
+"""
 
-model = Model(ggml_model='./models/gpt4all-model.bin', n_ctx=512)
+prompt_prefix = "\n User:"
+prompt_suffix = "\n Bob:"
 
-# prompt from https://github.com/ggerganov/llama.cpp/blob/master/prompts/chat-with-bob.txt
-prompt = """
-Transcript of a dialog, where the User interacts with an Assistant named Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision. To do this, Bob uses a database of information collected from many different sources, including books, journals, online articles, and more.
+model = Model(ggml_model='./models/gpt4all-model.bin', n_ctx=512, prompt_context=prompt_context, prompt_prefix=prompt_prefix,
+              prompt_suffix=prompt_suffix)
 
-User: Hello, Bob.
-Bob: Hello. How may I help you today?
-User: Please tell me the largest city in Europe.
-Bob: Sure. The largest city in Europe is Moscow, the capital of Russia.
-User:"""
+while True:
+    try:
+        prompt = input("You: ")
+        if prompt == '':
+            continue
+        print("Bob:", end='')
+        for tok in model.generate(prompt):
+            print(tok, end='', flush=True)
+        print()
+    except KeyboardInterrupt:
+        break
 
-model.generate(prompt, n_predict=256, new_text_callback=new_text_callback, grab_text_callback=grab_text_callback, interactive=True, repeat_penalty=1.0, antiprompt=["User:"])
 ```
 
-* You can pass any `llama context` [parameter](https://nomic-ai.github.io/pyllamacpp/#pyllamacpp.constants.LLAMA_CONTEXT_PARAMS_SCHEMA) as a keyword argument to the `Model` class
-* You can pass any `gpt` [parameter](https://nomic-ai.github.io/pyllamacpp/#pyllamacpp.constants.GPT_PARAMS_SCHEMA) as a keyword argument to the `generarte` method
-* You can always refer to the [short documentation](https://nomic-ai.github.io/pyllamacpp/) for more details.
 
+You can always refer to the [short documentation](https://abdeladim-s.github.io/pyllamacpp/) for more details.
 
-# Supported model
 
-### GPT4All
+# Supported models
 
-Download a GPT4All model from https://the-eye.eu/public/AI/models/nomic-ai/gpt4all/.
-The easiest approach is download a file whose name ends in `ggml.bin`--older model versions require conversion.
+Fully tested with [GPT4All](https://github.com/nomic-ai/gpt4all) model, see [PyGPT4All](https://github.com/nomic-ai/pygpt4all).
 
-If you have an older model downloaded that you want to convert, in your terminal run:
-```shell
-pyllamacpp-convert-gpt4all path/to/gpt4all_model.bin path/to/llama_tokenizer path/to/gpt4all-converted.bin
-```
+But all models supported by `llama.cpp` should be supported as well:
 
-# FAQs
-* Where to find the llama tokenizer? [#5](https://github.com/nomic-ai/pyllamacpp/issues/5)
+<blockquote>
+
+**Supported models:**
+
+- [X] LLaMA 🦙
+- [X] [Alpaca](https://github.com/ggerganov/llama.cpp#instruction-mode-with-alpaca)
+- [X] [Chinese LLaMA / Alpaca](https://github.com/ymcui/Chinese-LLaMA-Alpaca)
+- [X] [Vigogne (French)](https://github.com/bofenghuang/vigogne)
+- [X] [Vicuna](https://github.com/ggerganov/llama.cpp/discussions/643#discussioncomment-5533894)
+- [X] [Koala](https://bair.berkeley.edu/blog/2023/04/03/koala/)
+
+</blockquote>
 
 # Discussions and contributions
-If you find any bug, please open an [issue](https://github.com/nomic-ai/pyllamacpp/issues).
+If you find any bug, please open an [issue](https://github.com/abdeladim-s/pyllamacpp/issues).
 
-If you have any feedback, or you want to share how you are using this project, feel free to use the [Discussions](https://github.com/nomic-ai/pyllamacpp/discussions) and open a new topic.
+If you have any feedback, or you want to share how you are using this project, feel free to use the [Discussions](https://github.com/abdeladim-s/pyllamacpp/discussions) and open a new topic.
 
 # License

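For readers trying out the new API introduced by this commit's README, here is a minimal sketch that streams tokens to the console while also collecting them into a full string. `Model` and the generator-style `generate` come from the diff above; the model path is a placeholder, and since the Python signature of `generate` is not shown in this diff, only the prompt is passed.

```python
# Minimal sketch of the generator-based API shown in the README diff above.
# Assumptions: the model path is a placeholder, and `generate` yields string
# tokens as in the "Quick start" example.
from pyllamacpp.model import Model

model = Model(ggml_model='./models/gpt4all-model.bin')

tokens = []
for token in model.generate("Tell me a joke ?"):
    print(token, end='', flush=True)  # stream to the console as tokens arrive
    tokens.append(token)

full_text = ''.join(tokens)  # keep the complete generation for later use
```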
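The "Different persona" example suggests that `prompt_context` is fed to the model once and that each user turn is then wrapped with `prompt_prefix` and `prompt_suffix`. That reading is an assumption, since the binding's internals are not part of this diff; the hypothetical helper below only illustrates how such a template would assemble the final prompt.

```python
# Hypothetical illustration only: this is NOT pyllamacpp code. It assumes the
# context is prepended once and each user turn is wrapped as
# prefix + text + suffix, mirroring the persona example above.
def build_prompt(context: str, prefix: str, suffix: str, user_text: str) -> str:
    return f"{context}{prefix} {user_text}{suffix}"

prompt_context = "Act as Bob. Bob is helpful, kind, and honest.\n"
print(build_prompt(prompt_context, "\n User:", "\n Bob:", "Nice to meet you Bob!"))
# Output:
# Act as Bob. Bob is helpful, kind, and honest.
#
#  User: Nice to meet you Bob!
#  Bob:
```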