major improvements to tool calling logic #1839
Conversation
This reverts commit 70902ed.
…front and center for the LLM so that it doesn't repeat tool calls from earlier in the conversation. It still feeds the last 6 turns of chat history to the LLM to enable follow-ups.
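A rough sketch of that idea, with illustrative names only (this is not the actual koboldcpp code):

```python
# Illustrative sketch: keep the last few chat turns for context, but append
# the most recent tool results at the very end so the model sees them "front
# and center" and doesn't re-issue tool calls it has already made.
MAX_HISTORY_TURNS = 6  # the 6-turn window mentioned above

def build_tool_prompt(chat_turns, latest_tool_results):
    recent = chat_turns[-MAX_HISTORY_TURNS:]  # drop anything older
    prompt = ""
    for turn in recent:
        prompt += f"{turn['role']}: {turn['content']}\n"
    # The latest tool output goes last, immediately before the model's next reply.
    for result in latest_tool_results:
        prompt += f"tool ({result['name']}): {result['output']}\n"
    return prompt
```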
Thanks. It will take me some time to review, as I am waiting to get my laptop back.
Function-wise it performs better than our current implementation; speed-wise, for me it's very minor, though it may affect people with slow PP / gen times more. But those people can have an escape in the upcoming jinja implementation. I'd say we can merge this if you review the code.
Yup, can definitely confirm that it's affected by slow PP, but it's a lot better than the old way of doing it! With the old way, my 4B Qwen3 model was all too eager to run tools; even just saying a simple "hi" to it triggered a tool call! I'm glad you guys are implementing jinja as well though. In my opinion that's superior for tool calling, because it just works as it should, in the way the llama.cpp server already does. But at least with this change the native tool calling in kobold isn't as eager to randomly run tools, and more accurately chooses which tools to run. Looking forward to the code review!
```python
for name in toolnames:
    pollgrammar += ("" if pollgrammar=="" else " | ")
    pollgrammar += "\"" + name + "\""
pollgrammar += " | \"no_tool\""
```
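For context, the loop above builds a GBNF-style alternation of quoted tool names; a quick standalone run with made-up tool names shows the string it produces:

```python
# Standalone illustration with hypothetical tool names, not the real tool list.
toolnames = ["get_weather", "search_web"]
pollgrammar = ""
for name in toolnames:
    pollgrammar += ("" if pollgrammar == "" else " | ")
    pollgrammar += "\"" + name + "\""
pollgrammar += " | \"no_tool\""
print(pollgrammar)  # "get_weather" | "search_web" | "no_tool"
```

That string can then serve as the body of a grammar rule restricting the model's answer to one of the tool names (or `no_tool`, which is what the discussion below is about).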
Why remove the null tool? Seems like it's still a good idea to have, even if the LLM knows that a tool is needed; it gives it a second chance to change its mind.
Because in testing this often resulted in tool calls that I explicitly asked for being cancelled. Best to just not interfere with its final reasoning.
Formerly, the null tool was there to give the LLM an out it would go for; rosie's PR instead does this with a reasoning step. Same concept, different implementation.
Consolidated testing results: output samples for your current prompt unwrapping, the reworked prompt unwrap, and the AI's reply.
Really sorry for the huge delay, I was caught up with laptop repairs and troubleshooting. I've made some tweaks to the PR and added some comments; do you think you could give it a try and let me know if it works well for you?
It's okay. Sure, I'll try. I'm sorry for messing up the git branch, hope all the effort will be worth it! Please check out the new (hopefully fixed) pull request and add your changes to it.
As we've discussed in Discord, this change alters the way koboldcpp determines what tool to use in some pretty drastic ways that vastly improve its accuracy, especially with small LLMs. Instead of doing one request to the LLM asking it whether a tool should be used, forced down to 5 tokens with a grammar that allows only a simple "yes/no" answer, it now gives the LLM full freedom to write out its decision and why it took that decision, with its final decision text always added at the end of the response. Then we take that response and apply the yes/no grammar to that instead!
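A minimal sketch of that two-pass flow, assuming a hypothetical `generate()` helper that runs the model on a prompt with an optional GBNF grammar (the actual koboldcpp implementation differs in its details):

```python
# Sketch only: generate(prompt, grammar) is a hypothetical helper that returns
# the model's completion, constrained by a GBNF grammar when one is given.

def should_use_tool(generate, conversation, tool_descriptions):
    # Pass 1: let the model reason freely about whether a tool is needed,
    # asking it to finish with an explicit final decision.
    reasoning_prompt = (
        f"{conversation}\n\nAvailable tools:\n{tool_descriptions}\n"
        "Decide whether a tool call is needed. Explain your reasoning, "
        "then state your final decision on the last line."
    )
    reasoning = generate(reasoning_prompt, grammar=None)

    # Pass 2: apply the strict yes/no grammar to the model's own written-out
    # decision instead of forcing a 5-token snap judgement on the raw chat.
    decision_prompt = reasoning_prompt + reasoning + "\nFinal answer:"
    decision = generate(decision_prompt, grammar='root ::= "yes" | "no"')
    return decision.strip() == "yes"
```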