Conversation

@ArthurCamara

Adding support for Llama.cpp with quantized models.

8-bit model: https://huggingface.co/castorini/rank_vicuna_7b_v1_q8_0/
4-bit model: https://huggingface.co/castorini/rank_vicuna_7b_v1_q4_0/
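A minimal sketch of how one of these quantized checkpoints could be loaded with the `llama-cpp-python` bindings. The model file name, context size, and prompt are placeholders, not part of this PR:

```python
from llama_cpp import Llama

# Path to a locally downloaded quantized file from one of the repos above
# (e.g. castorini/rank_vicuna_7b_v1_q8_0); the exact filename is a placeholder.
llm = Llama(model_path="rank_vicuna_7b_v1_q8_0.gguf", n_ctx=4096)

prompt = "..."  # RankVicuna reranking prompt goes here
output = llm(prompt, max_tokens=256, temperature=0.0)
print(output["choices"][0]["text"])
```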
