Commit 9d88632

minor version update: antiprompt mechanism removed
1 parent a2709d7 commit 9d88632

File tree

8 files changed (+291, -239 lines)


README.md

Lines changed: 47 additions & 19 deletions
@@ -1,10 +1,14 @@
 # PyLLaMACpp
-
-* Python bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp) + backend for [GPT4All](https://github.com/nomic-ai/pygpt4all) LLaMA models.
-
 [![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](https://opensource.org/licenses/MIT)
 [![PyPi version](https://badgen.net/pypi/v/pyllamacpp)](https://pypi.org/project/pyllamacpp/)
 
+Python bindings for [llama.cpp](https://github.com/ggerganov/llama.cpp)
+
+
+<p align="center">
+  <img src="./docs/demo.gif">
+</p>
+
 
 For those who don't know, `llama.cpp` is a port of Facebook's LLaMA model in pure C/C++:
 
@@ -26,7 +30,8 @@ For those who don't know, `llama.cpp` is a port of Facebook's LLaMA model in pur
 * [Tutorial](#tutorial)
 * [Quick start](#quick-start)
 * [Interactive Dialogue](#interactive-dialogue)
-* [Different persona](#different-persona)
+* [Attribute a persona to the language model](#attribute-a-persona-to-the-language-model)
+* [API reference](#api-reference)
 * [Supported models](#supported-models)
 * [Discussions and contributions](#discussions-and-contributions)
 * [License](#license)
@@ -42,8 +47,7 @@ However, the compilation process of `llama.cpp` is taking into account the archi
 so you might need to build it from source:
 
 ```shell
-git clone --recursive https://github.com/nomic-ai/pyllamacpp && cd pyllamacpp
-pip install .
+pip install git+https://github.com/abdeladim-s/pyllamacpp.git
 ```
 
 # CLI
@@ -63,6 +67,8 @@ usage: pyllamacpp [-h] [--n_ctx N_CTX] [--n_parts N_PARTS] [--seed SEED] [--f16_
                   [--n_batch N_BATCH]
                   model
 
+This is like a chatbot, You can start the conversation with `Hi, can you help me ?` Pay attention though that it may hallucinate!
+
 positional arguments:
   model                 The path of the model file
 
@@ -92,8 +98,8 @@ options:
   --repeat_penalty REPEAT_PENALTY
                         repeat_penalty
   --n_batch N_BATCH     batch size for prompt processing
-
 ```
+
 # Tutorial
 
 ### Quick start
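For orientation, a minimal quick-start sketch in the spirit of the section above; it only uses calls that appear elsewhere in this diff (`Model(model_path=...)` and `Model.generate(prompt)`), and the model path and prompt are placeholders:

```python
from pyllamacpp.model import Model

# Placeholder path; point this at a real GGML model file.
model = Model(model_path='/path/to/ggml/model')

# Stream tokens to stdout as they are generated.
for token in model.generate("Once upon a time, "):
    print(token, end='', flush=True)
print()
```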
@@ -113,7 +119,7 @@ You can set up an interactive dialogue by simply keeping the `model` variable al
 ```python
 from pyllamacpp.model import Model
 
-model = Model(ggml_model='./models/gpt4all-model.bin')
+model = Model(model_path='/path/to/ggml/model')
 while True:
     try:
         prompt = input("You: ", flush=True)
@@ -126,40 +132,62 @@ while True:
     except KeyboardInterrupt:
         break
 ```
-### Different persona
-You can customize the `prompt_context` to _"give the language model a different persona"_ as follows:
+### Attribute a persona to the language model
+
+The following is an example showing how to _"attribute a persona to the language model"_ :
 
 ```python
 from pyllamacpp.model import Model
 
-prompt_context = """ Act as Bob. Bob is helpful, kind, honest, good at writing, and never fails to answer the User's requests immediately and with precision. To do this, Bob uses a database of information collected from many different sources, including books, journals, online articles, and more.
+prompt_context = """Act as Bob. Bob is helpful, kind, honest,
+and never fails to answer the User's requests immediately and with precision.
 
 User: Nice to meet you Bob!
 Bob: Welcome! I'm here to assist you with anything you need. What can I do for you today?
 """
 
-prompt_prefix = "\n User:"
-prompt_suffix = "\n Bob:"
+prompt_prefix = "\nUser:"
+prompt_suffix = "\nBob:"
 
-model = Model(ggml_model=model, n_ctx=512, prompt_context=prompt_context, prompt_prefix=prompt_prefix,
+model = Model(model_path='/path/to/ggml/model',
+              prompt_context=prompt_context,
+              prompt_prefix=prompt_prefix,
               prompt_suffix=prompt_suffix)
 
+sequence = ''
+stop_word = prompt_prefix.strip()
+
 while True:
     try:
         prompt = input("You: ")
         if prompt == '':
             continue
-        print(f"Bob:", end='')
-        for tok in model.generate(prompt):
-            print(f"{tok}", end='', flush=True)
+        print(f"AI: ", end='')
+        for token in model.generate(prompt):
+            if token == '\n':
+                sequence += token
+                continue
+            if len(sequence) != 0:
+                if stop_word.startswith(sequence.strip()):
+                    sequence += token
+                    if sequence.strip() == stop_word:
+                        sequence = ''
+                        break
+                    else:
+                        continue
+                else:
+                    print(f"{sequence}", end='', flush=True)
+                    sequence = ''
+            print(f"{token}", end='', flush=True)
+
         print()
     except KeyboardInterrupt:
         break
-
 ```
 
 
-You can always refer to the [short documentation](https://abdeladim-s.github.io/pyllamacpp/) for more details.
+# API reference
+You can check the [API reference documentation](https://abdeladim-s.github.io/pyllamacpp/) for more details.
 
 
 # Supported models
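Since the commit removes the built-in antiprompt mechanism, the README example above suppresses the reverse prompt on the client side instead. For reference, a minimal, self-contained sketch of that buffering logic, decoupled from the model (the token list below is made up purely for illustration):

```python
def stream_until_stop(tokens, stop_word="User:"):
    """Print streamed tokens, holding back anything that could be the stop word."""
    sequence = ''
    for token in tokens:
        if token == '\n':
            sequence += token          # newlines may precede the stop word; buffer them
            continue
        if sequence and stop_word.startswith(sequence.strip()):
            sequence += token          # still a possible prefix of the stop word
            if sequence.strip() == stop_word:
                return                 # stop word completed: stop without printing it
            continue
        if sequence:
            print(sequence, end='')    # buffered text diverged from the stop word: flush it
            sequence = ''
        print(token, end='', flush=True)

# Fake token stream standing in for model.generate(); prints "Hello there" and stops.
stream_until_stop(["Hello", " there", "\n", "User", ":", " ignored"])
print()
```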

docs/demo.gif

686 KB

docs/index.md

Lines changed: 0 additions & 4 deletions
@@ -3,8 +3,4 @@
 
 ::: pyllamacpp.model
 
-::: pyllamacpp.constants
-    options:
-        show_if_no_docstring: true
-
 ::: pyllamacpp.utils

pyllamacpp/cli.py

Lines changed: 148 additions & 18 deletions
@@ -9,8 +9,6 @@
 import importlib.metadata
 import logging
 
-import pyllamacpp.constants as constants
-
 __version__ = importlib.metadata.version('pyllamacpp')
 
 __header__ = f"""
@@ -25,27 +23,132 @@
 
 PyLLaMACpp
 A simple Command Line Interface to test the package
-Version: {__version__}
+Version: {__version__}
+
+
 =========================================================================================
 """
 
 from pyllamacpp.model import Model
 
+LLAMA_CONTEXT_PARAMS_SCHEMA = {
+    'n_ctx': {
+        'type': int,
+        'description': "text context",
+        'options': None,
+        'default': -1
+    },
+    'n_parts': {
+        'type': int,
+        'description': "",
+        'options': None,
+        'default': -1
+    },
+    'seed': {
+        'type': int,
+        'description': "RNG seed",
+        'options': None,
+        'default': -1
+    },
+    'f16_kv': {
+        'type': bool,
+        'description': "use fp16 for KV cache",
+        'options': None,
+        'default': 0
+    },
+    'logits_all': {
+        'type': bool,
+        'description': "the llama_eval() call computes all logits, not just the last one",
+        'options': None,
+        'default': 0
+    },
+    'vocab_only': {
+        'type': bool,
+        'description': "only load the vocabulary, no weights",
+        'options': None,
+        'default': 0
+    },
+    'use_mlock': {
+        'type': bool,
+        'description': "force system to keep model in RAM",
+        'options': None,
+        'default': 0
+    },
+    'embedding': {
+        'type': bool,
+        'description': "embedding mode only",
+        'options': None,
+        'default': 0
+    }
+}
+
+GPT_PARAMS_SCHEMA = {
+    'n_predict': {
+        'type': int,
+        'description': "Number of tokens to predict",
+        'options': None,
+        'default': 50
+    },
+    'n_threads': {
+        'type': int,
+        'description': "Number of threads",
+        'options': None,
+        'default': 4
+    },
+    'repeat_last_n': {
+        'type': int,
+        'description': "Last n tokens to penalize",
+        'options': None,
+        'default': 64
+    },
+    # sampling params
+    'top_k': {
+        'type': int,
+        'description': "top_k",
+        'options': None,
+        'default': 40
+    },
+    'top_p': {
+        'type': float,
+        'description': "top_p",
+        'options': None,
+        'default': 0.95
+    },
+    'temp': {
+        'type': float,
+        'description': "temp",
+        'options': None,
+        'default': 0.8
+    },
+    'repeat_penalty': {
+        'type': float,
+        'description': "repeat_penalty",
+        'options': None,
+        'default': 1.3
+    },
+    'n_batch': {
+        'type': int,
+        'description': "batch size for prompt processing",
+        'options': None,
+        'default': True
+    }
+}
+
 
 def _get_llama_context_params(args) -> dict:
     """
     Helper function to get params from argparse as a `dict`
     """
     params = {}
     for arg in args.__dict__:
-        if arg in constants.LLAMA_CONTEXT_PARAMS_SCHEMA.keys() and getattr(args, arg) is not None:
-            if constants.LLAMA_CONTEXT_PARAMS_SCHEMA[arg]['type'] is bool:
+        if arg in LLAMA_CONTEXT_PARAMS_SCHEMA.keys() and getattr(args, arg) is not None:
+            if LLAMA_CONTEXT_PARAMS_SCHEMA[arg]['type'] is bool:
                 if getattr(args, arg).lower() == 'false':
                     params[arg] = False
                 else:
                     params[arg] = True
             else:
-                params[arg] = constants.LLAMA_CONTEXT_PARAMS_SCHEMA[arg]['type'](getattr(args, arg))
+                params[arg] = LLAMA_CONTEXT_PARAMS_SCHEMA[arg]['type'](getattr(args, arg))
     return params
 
 
@@ -55,14 +158,14 @@ def _get_gpt_params(args) -> dict:
     """
     params = {}
     for arg in args.__dict__:
-        if arg in constants.GPT_PARAMS_SCHEMA.keys() and getattr(args, arg) is not None:
-            if constants.GPT_PARAMS_SCHEMA[arg]['type'] is bool:
+        if arg in GPT_PARAMS_SCHEMA.keys() and getattr(args, arg) is not None:
+            if GPT_PARAMS_SCHEMA[arg]['type'] is bool:
                 if getattr(args, arg).lower() == 'false':
                     params[arg] = False
                 else:
                     params[arg] = True
             else:
-                params[arg] = constants.GPT_PARAMS_SCHEMA[arg]['type'](getattr(args, arg))
+                params[arg] = GPT_PARAMS_SCHEMA[arg]['type'](getattr(args, arg))
     return params
 
 
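For illustration only (this block is not part of the commit): a self-contained sketch of how the inlined schemas drive type coercion of argparse's string values. The mini-schema below is a hypothetical stand-in for the real LLAMA_CONTEXT_PARAMS_SCHEMA / GPT_PARAMS_SCHEMA entries.

```python
import argparse

# Hypothetical mini-schema mirroring the shape of the schemas inlined above.
SCHEMA = {'n_ctx': {'type': int}, 'use_mlock': {'type': bool}}

def coerce(args: argparse.Namespace) -> dict:
    """Convert argparse string values to the types declared in the schema."""
    params = {}
    for name, value in vars(args).items():
        if name in SCHEMA and value is not None:
            if SCHEMA[name]['type'] is bool:
                params[name] = value.lower() != 'false'   # 'false' -> False, anything else -> True
            else:
                params[name] = SCHEMA[name]['type'](value)
    return params

print(coerce(argparse.Namespace(n_ctx='512', use_mlock='false', model='x.bin')))
# -> {'n_ctx': 512, 'use_mlock': False}
```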
@@ -78,24 +181,49 @@ class bcolors:
     UNDERLINE = '\033[4m'
 
 
+PROMPT_CONTEXT = "Below is an instruction that describes a task. Write a response that appropriately completes the " \
+                 "request\n"
+PROMPT_PREFIX = "\n\n##Instruction:\n"
+PROMPT_SUFFIX = "\n\n##Response:\n"
+
+
 def run(args):
     print(f"[+] Running model `{args.model}`")
     llama_params = _get_llama_context_params(args)
     print(f"[+] LLaMA context params: `{llama_params}`")
     gpt_params = _get_gpt_params(args)
     print(f"[+] GPT params: `{gpt_params}`")
-    model = Model(ggml_model=args.model, **llama_params)
+    model = Model(model_path=args.model,
+                  prompt_context=PROMPT_CONTEXT,
+                  prompt_prefix=PROMPT_PREFIX,
+                  prompt_suffix=PROMPT_SUFFIX,
+                  **llama_params)
     print("...")
     print("[+] Press Ctrl+C to Stop ... ")
     print("...")
+    sequence = ''
     while True:
         try:
             prompt = input("You: ")
             if prompt == '':
                 continue
-            print(f"{bcolors.OKCYAN}AI: {bcolors.ENDC}", end='', flush=True)
-            for tok in model.generate(prompt, **gpt_params):
-                print(f"{bcolors.OKCYAN}{tok}{bcolors.ENDC}", end='', flush=True)
+            print(f"{bcolors.OKBLUE}AI: {bcolors.ENDC}", end='', flush=True)
+            for token in model.generate(prompt, **gpt_params):
+                if token == '\n':
+                    sequence += token
+                    continue
+                if len(sequence) != 0:
+                    if PROMPT_PREFIX.strip().startswith(sequence.strip()):
+                        sequence += token
+                        if sequence.strip() == PROMPT_PREFIX.strip():
+                            sequence = ''
+                            break
+                        else:
+                            continue
+                    else:
+                        print(f"{sequence}", end='', flush=True)
+                        sequence = ''
+                print(f"{bcolors.OKCYAN}{token}{bcolors.ENDC}", end='', flush=True)
             print()
         except KeyboardInterrupt:
             break
@@ -104,18 +232,20 @@ def run(args):
 def main():
     print(__header__)
 
-    parser = argparse.ArgumentParser(description="", allow_abbrev=True)
+    parser = argparse.ArgumentParser(description="This is like a chatbot, You can start the conversation with `Hi, "
+                                                 "can you help me ?`\nPay attention though that it may hallucinate!",
+                                     allow_abbrev=True)
     # Positional args
     parser.add_argument('model', type=str, help="The path of the model file")
 
     # add params from LLAMA_CONTEXT_PARAMS_SCHEMA
-    for param in constants.LLAMA_CONTEXT_PARAMS_SCHEMA:
-        param_fields = constants.LLAMA_CONTEXT_PARAMS_SCHEMA[param]
+    for param in LLAMA_CONTEXT_PARAMS_SCHEMA:
+        param_fields = LLAMA_CONTEXT_PARAMS_SCHEMA[param]
         parser.add_argument(f'--{param}',
                             help=f'{param_fields["description"]}')
 
-    for param in constants.GPT_PARAMS_SCHEMA:
-        param_fields = constants.GPT_PARAMS_SCHEMA[param]
+    for param in GPT_PARAMS_SCHEMA:
+        param_fields = GPT_PARAMS_SCHEMA[param]
         parser.add_argument(f'--{param}',
                             help=f'{param_fields["description"]}')
 
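Rounding off, a hedged example of invoking the updated CLI: the flags come from the schemas inlined into `pyllamacpp/cli.py` above, while the model path is a placeholder.

```shell
# Model path is hypothetical; any local GGML model file should work here.
pyllamacpp ./models/ggml-model-q4_0.bin --n_ctx 512 --temp 0.8 --n_predict 128
```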