v6.9.1 - Welcome Kobold!
Patch 6.9.1 Notes
For minor updates, I'll paste the notes from the most recent major update below for convenience and list the changes specific to the minor update here.
- Added `Qwen 2.5 - 32b` chat model.
- Added sparkgraphs for metrics and the ability to right-click on the metrics bar and select a different visualization.
Welcome Kobold edition v6.9.0
Ask Jeeves!
- Exciting new "Ask Jeeves" helper who answers questions about how to use the program. Simply click "Jeeves" in the upper left.
- "Jeeves" gets his knowledge from a vector database that comes shipped with this release! NO MORE USER GUIDE TAB - just ASK JEEVES!
- IMPORTANT: After running `setup_windows.py` you must go into the `Assets` folder, right-click on `koboldcpp_nocuda.exe`, select Properties, and check the "Unblock" checkbox first! If the checkbox is not there, try starting Jeeves and see if it works. Create a Github Issue if it doesn't work because Ask Jeeves is a new feature.
- IMPORTANT: You may also need to disable or make an exception for any firewall you have. Submit a Github Issue if you encounter any problems.
Scrape Python Library Documentation
- In the Tools Tab, simply select a Python library, click `Scrape`, and all the `.html` files will be downloaded to the `Scraped_Documentation` folder.
- Create a vector database out of all of the `.html` files for a given library, then use one of the coding-specific models to answer questions!
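To illustrate what happens between scraping and building the vector database, here is a minimal sketch of how the downloaded `.html` files could be turned into plain text. It is not the program's actual code: `html_to_text` and `load_scraped_docs` are hypothetical helper names, and only the `Scraped_Documentation` folder name comes from these notes.

```python
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects visible text and skips the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.parts.append(text)


def html_to_text(html):
    """Return the visible text of an HTML string, joined by single spaces."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def load_scraped_docs(folder):
    """Map each .html file name under `folder` to its extracted plain text."""
    docs = {}
    for path in sorted(Path(folder).rglob("*.html")):
        raw = path.read_text(encoding="utf-8", errors="ignore")
        docs[path.name] = html_to_text(raw)
    return docs
```

Calling `load_scraped_docs("Scraped_Documentation")` would then give one text entry per page, ready to be chunked and embedded by whatever vector model you select.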
Huggingface Access Token
- You can now enter an "access token" and access models that are "gated" on huggingface. Currently, `llama 3.2 - 3b` and `mistral-small - 22b` are the only gated models.
- Ask Jeeves how to get a huggingface access token.
Other Improvements
- The vector models are now downloaded using the `snapshot_download` functionality from `huggingface_hub`, which can exclude unnecessary files such as `onnx`, `.bin` (when an equivalent `.safetensors` version is available), and others. This significantly reduces the amount of data that this program downloads and therefore increases speed and usability.
- This speedup should pertain to vector, chat, and whisper models, and implementing the `snapshot_download` for TTS models is planned.
- New `Compare GPUs` button in the Tools Tab, which displays metrics for various GPUs so you can better determine your settings. Charts and graphs for chat/vision models will be added in the near future.
- New metrics bar with speedometer-looking widgets.
- Removed the User Guide Tab altogether to free up space. You can now simply `Ask Jeeves` instead.
- Lots and lots of refactoring to improve various things...
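For the curious, the download filtering described above can be sketched roughly as follows. This is an illustration under assumptions, not the program's actual code: the helper names and the exact glob patterns are hypothetical, while `list_repo_files` and `snapshot_download` (with its `ignore_patterns` and `token` parameters) are real `huggingface_hub` functions.

```python
from fnmatch import fnmatchcase


def build_ignore_patterns(repo_files):
    """Return glob patterns for files that can be skipped (assumed patterns)."""
    patterns = ["*.onnx"]
    # Only skip .bin weights when a .safetensors equivalent exists in the repo.
    if any(name.endswith(".safetensors") for name in repo_files):
        patterns.append("*.bin")
    return patterns


def files_to_download(repo_files, patterns):
    """Preview which files remain after applying the ignore patterns."""
    return [
        name
        for name in repo_files
        if not any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


def download_model(repo_id, local_dir, token=None):
    """Download a model repo while skipping the unneeded files."""
    # Imported lazily so the helpers above work without huggingface_hub installed.
    from huggingface_hub import list_repo_files, snapshot_download

    patterns = build_ignore_patterns(list_repo_files(repo_id, token=token))
    return snapshot_download(
        repo_id=repo_id,
        local_dir=local_dir,
        ignore_patterns=patterns,
        token=token,  # only needed for gated models
    )
```

Passing the huggingface access token through `token` is how a gated model could be fetched with the same call; for non-gated models it can stay `None`.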
Added/Removed Chat Models
- Added `Qwen 2.5 - 1.5b`, `Llama 3.2 - 3b`, `Internlm 2.5 - 1.8b`, `Dolphin-Llama 3.1 - 8b`, `Mistral-Small - 22b`.
- Removed `Longwriter Llama 3.1 - 8b`, `Longwriter GLM4 - 9b`, `Yi - 9b`, `Solar Pro Preview - 22.1b`.
Added/Removed Vision Models
- Removed `Llava 1.5`, `Bakllava`, `Falcon-vlm - 11b`, and `Phi-3-Vision` models as either under-performing or eclipsed by pre-existing models that have additional benefits.
Roadmap
- Add `Kobold` as a backend in addition to `LM Studio` and `Local Models`, at which point I'll probably have to rename this github repo.
- Add `OpenAI` backend.
- Remove LM Studio Server settings and revise instructions since LM Studio has changed significantly since they were last done.
Full Changelog: v6.8.2...v6.9.0
