The same example as in the README.md gives an error when you switch to LLaMA-2-7B:
python3 -m manifest.api.app \
--model_type huggingface \
--model_generation_type text-generation \
--model_name_or_path NumbersStation/nsql-llama-2-7B \
--device 0
This gives the following error:
The following `model_kwargs` are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)
127.0.0.1 - - [17/Dec/2023 15:53:35] "POST /completions HTTP/1.1" 400 -
I found that we need to modify the tokenizer call like this:
inputs = tokenizer(
    "###Instruction\nGenerate a python function to find number of CPU cores###Response\n",
    return_tensors="pt",
    return_token_type_ids=False,  # <-- add this
)
But I'm not sure where in manifest this should be modified.
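As a workaround until this is fixed in manifest itself, one option is to drop the offending key from the encoded inputs before they are passed to `model.generate`. Below is a minimal, hedged sketch; the helper name `strip_unused_kwargs` is my own invention, not part of manifest, and a plain dict stands in for the tokenizer output:

```python
def strip_unused_kwargs(inputs, unused=("token_type_ids",)):
    """Remove keys that the model's generate() does not accept.

    `inputs` is the dict-like object returned by the tokenizer;
    returns a plain dict without the unused keys.
    """
    return {k: v for k, v in inputs.items() if k not in unused}

# Plain dict standing in for the tokenizer's BatchEncoding output:
encoded = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],
}
filtered = strip_unused_kwargs(encoded)
# model.generate(**filtered) would then no longer receive token_type_ids
```

This avoids touching every tokenizer call site: wherever manifest builds the generate kwargs, filtering them once would cover all models that reject `token_type_ids`.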