Commit 6f65b74

Pass num_tokens_per_iter and max_prefill_iters params through in lmdeploy serve api_server (#3504)
1 parent 9901f76 commit 6f65b74

1 file changed, +2 -0 lines changed

lmdeploy/cli/serve.py

Lines changed: 2 additions & 0 deletions
@@ -268,6 +268,8 @@ def gradio(args):
 cache_block_seq_len=args.cache_block_seq_len,
 enable_prefix_caching=args.enable_prefix_caching,
 max_prefill_token_num=args.max_prefill_token_num,
+num_tokens_per_iter=args.num_tokens_per_iter,
+max_prefill_iters=args.max_prefill_iters,
 communicator=args.communicator)
 chat_template_config = get_chat_template(args.chat_template)
 run(args.model_path_or_server,
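
The pass-through pattern in this diff can be sketched in isolation. The example below is a minimal illustration, not the code in lmdeploy/cli/serve.py: `EngineConfigSketch`, `build_parser`, `config_from_args`, the flag spellings, and all default values are hypothetical stand-ins chosen for the sketch. Only the three parameter names come from the diff.

```python
import argparse
from dataclasses import dataclass


@dataclass
class EngineConfigSketch:
    """Hypothetical stand-in for the backend config built in serve.py.

    Field names mirror the diff; the default values are assumptions.
    """
    max_prefill_token_num: int = 8192
    num_tokens_per_iter: int = 0
    max_prefill_iters: int = 1


def build_parser() -> argparse.ArgumentParser:
    # Flag spellings and defaults are assumed for illustration only.
    parser = argparse.ArgumentParser(prog="serve-sketch")
    parser.add_argument("--max-prefill-token-num", type=int, default=8192)
    parser.add_argument("--num-tokens-per-iter", type=int, default=0)
    parser.add_argument("--max-prefill-iters", type=int, default=1)
    return parser


def config_from_args(args: argparse.Namespace) -> EngineConfigSketch:
    # Every parsed value must be forwarded explicitly. A keyword omitted
    # here silently falls back to the dataclass default, so the CLI flag
    # would be accepted but have no effect.
    return EngineConfigSketch(
        max_prefill_token_num=args.max_prefill_token_num,
        num_tokens_per_iter=args.num_tokens_per_iter,
        max_prefill_iters=args.max_prefill_iters,
    )


args = build_parser().parse_args(
    ["--num-tokens-per-iter", "4096", "--max-prefill-iters", "2"]
)
config = config_from_args(args)
```

In this sketch, removing the last two keyword arguments from `config_from_args` would leave the flags parseable but ignored, which is the kind of gap the two added lines close.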
