Commit b057894

Support min_p in openai completions_v1 (#3506)

* Support min_p in openai completions_v1
* Support min_p in CompletionRequest protocol

1 parent 2d46035 commit b057894

File tree

2 files changed: +7 −0 lines changed


lmdeploy/serve/openai/api_server.py

Lines changed: 6 additions & 0 deletions
@@ -596,6 +596,11 @@ async def completions_v1(request: CompletionRequest, raw_request: Request = None
             this to False. This is setup to True in slow tokenizers.
         - top_k (int): The number of the highest probability vocabulary
             tokens to keep for top-k-filtering
+        - min_p (float): Minimum token probability, which will be scaled by the
+            probability of the most likely token. It must be a value between
+            0 and 1. Typical values are in the 0.01-0.2 range, comparably
+            selective as setting `top_p` in the 0.99-0.8 range (use the
+            opposite of normal `top_p` values)

         Currently we do not support the following features:
         - logprobs (not supported yet)
@@ -633,6 +638,7 @@ async def completions_v1(request: CompletionRequest, raw_request: Request = None
             ignore_eos=request.ignore_eos,
             stop_words=request.stop,
             skip_special_tokens=request.skip_special_tokens,
+            min_p=request.min_p,
             random_seed=random_seed,
             spaces_between_special_tokens=request.spaces_between_special_tokens)
     generators = []
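
The docstring added above describes min_p as a probability floor scaled by the probability of the most likely token: any token whose probability falls below `min_p * max_prob` is excluded before sampling. A minimal NumPy sketch of that filtering rule (an illustration only, not lmdeploy's actual sampling kernel; the helper name `min_p_filter` is hypothetical):

```python
import numpy as np

def min_p_filter(logits: np.ndarray, min_p: float) -> np.ndarray:
    """Zero out tokens whose probability is below min_p * max_prob,
    then renormalize. min_p must be in [0, 1]."""
    # Numerically stable softmax over the logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    # The threshold scales with the most likely token's probability,
    # so the filter adapts to how peaked the distribution is.
    threshold = min_p * probs.max()
    filtered = np.where(probs >= threshold, probs, 0.0)
    return filtered / filtered.sum()

# With min_p=0.3, a token far below 30% of the top probability is dropped.
result = min_p_filter(np.array([2.0, 1.0, -5.0]), min_p=0.3)
```

With a confident distribution (one dominant token) the threshold is high and few alternatives survive; with a flat distribution the threshold drops and more tokens remain, which is why typical min_p values (0.01-0.2) behave comparably to high top_p values.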

lmdeploy/serve/openai/protocol.py

Lines changed: 1 addition & 0 deletions
@@ -276,6 +276,7 @@ class CompletionRequest(BaseModel):
     spaces_between_special_tokens: Optional[bool] = True
     top_k: Optional[int] = 40  # for opencompass
     seed: Optional[int] = None
+    min_p: float = 0.0


 class CompletionResponseChoice(BaseModel):

0 commit comments
