feat: Add LLM reasoning option for OpenAI compatible #231
base: main
Conversation
Summary of Changes

Hello @chupark, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request introduces enhanced support for OpenAI-compatible LLMs by adding a "reasoning option." It allows the system to differentiate between general maximum tokens and maximum completion tokens, adapting the API requests and UI dynamically based on whether a model supports reasoning thoughts. This change improves the flexibility and configurability of LLM integrations within the Dify plugin.
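For reviewers who want the gist before reading the diff, here is a minimal sketch of the behaviour described above, assuming the credential keys this PR introduces (`reasoning_thought_support`, `endpoint_model_name`); `build_validation_payload` is a hypothetical helper name used only for illustration, not code from the PR:

```python
# Illustrative sketch only: choose the token-limit field based on reasoning support.
def build_validation_payload(credentials: dict, model: str) -> dict:
    model_name = credentials.get("endpoint_model_name", model)
    if credentials.get("reasoning_thought_support") == "supported":
        # Reasoning-capable endpoints expect max_completion_tokens rather than max_tokens.
        return {"model": model_name, "max_completion_tokens": 5}
    return {"model": model_name, "max_tokens": 5}
```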
Code Review
This pull request adds support for an LLM reasoning option for OpenAI-compatible models, which involves using max_completion_tokens instead of max_tokens when reasoning is supported. The changes look good overall, but I've identified a couple of areas with code duplication and repeated function calls that could be improved for better maintainability and efficiency. My review includes suggestions to address these points.
if credentials.get("reasoning_thought_support") == "supported":
    # for reasoning thought support, they use max_completion_tokens
    data = {"model": credentials.get("endpoint_model_name", model), "max_completion_tokens": 5}
else:
    data = {"model": credentials.get("endpoint_model_name", model), "max_tokens": 5}
There's some code duplication here in the construction of the data dictionary. The {"model": credentials.get("endpoint_model_name", model)} part is repeated in both branches of the if/else statement. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, you could factor out the common parts. For example:
model_name = credentials.get("endpoint_model_name", model)
if credentials.get("reasoning_thought_support") == "supported":
    # for reasoning thought support, they use max_completion_tokens
    data = {"model": model_name, "max_completion_tokens": 5}
else:
    data = {"model": model_name, "max_tokens": 5}

This would make the code cleaner and easier to maintain.
name=get_max_token_param()["name"],
label=I18nObject(en_US=get_max_token_param()["label"], zh_Hans="最大标记"),
The get_max_token_param() function is called twice here, once for the name and once for the label. This is inefficient. It would be better to call the function once before constructing the AIModelEntity, store its result in a variable, and then reuse that variable. This would avoid the redundant computation. For example:
max_token_param = get_max_token_param()
entity = AIModelEntity(
    # ...
    parameter_rules=[
        # ...
        ParameterRule(
            name=max_token_param["name"],
            label=I18nObject(en_US=max_token_param["label"], zh_Hans="最大标记"),
            # ...
        ),
    ],
    # ...
)
Related to this PR:
PRESENCE_PENALTY = "presence_penalty"
FREQUENCY_PENALTY = "frequency_penalty"
MAX_TOKENS = "max_tokens"
MAX_COMPLETION_TOKENS = "max_completion_tokens"
You need to add this parameter into PARAMETER_RULE_TEMPLATE; it also needs to be added to plugin-daemon.
Thanks, I've updated it.
Also added MAX_COMPLETION_TOKENS into PARAMETER_RULE_TEMPLATE
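For anyone following along, an entry for the new parameter would presumably mirror the existing MAX_TOKENS entry in PARAMETER_RULE_TEMPLATE. The sketch below is an assumption about the shape of that template (a dict keyed by DefaultParameterName) and about the labels, help text, and bounds; it is not the exact code added in this PR:

```python
# Hypothetical sketch, modelled on the existing MAX_TOKENS entry; labels, help text,
# and bounds are assumptions, and the dict shape assumes the current template layout.
PARAMETER_RULE_TEMPLATE[DefaultParameterName.MAX_COMPLETION_TOKENS] = {
    "label": {"en_US": "Max Completion Tokens", "zh_Hans": "最大完成标记"},
    "type": "int",
    "help": {
        "en_US": "Upper bound on tokens the model may generate, including reasoning tokens.",
        "zh_Hans": "模型可生成的 token 数量上限，包含推理 token。",
    },
    "required": False,
    "default": 1024,
    "min": 1,
    "max": 2048,
    "precision": 0,
}
```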
features = []


# for reasoning thought support, they use max_completion_tokens
def get_max_token_param():
Can be replaced by PARAMETER_RULE_TEMPLATE
def get_max_token_param():
    if credentials.get("reasoning_thought_support") == "supported":
        max_token_param_name = DefaultParameterName.MAX_COMPLETION_TOKENS.value
        max_token_param_label = "Max Completion Tokens"
    else:
        max_token_param_name = DefaultParameterName.MAX_TOKENS.value
        max_token_param_label = "Max Tokens"
I'm trying to do this without defining a function.
I need to dynamically set the max_tokens and max_completion_tokens parameters depending on whether the LLM model supports reasoning, but I can't find a better way.
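One possible way to avoid the helper, sketched below using the same credential key and the entity classes already shown in this thread, is a conditional expression that picks the name/label pair once; fields not relevant to the example are elided:

```python
# Sketch only: select the parameter name/label inline instead of via a helper function.
supports_reasoning = credentials.get("reasoning_thought_support") == "supported"
max_token_param_name, max_token_param_label = (
    (DefaultParameterName.MAX_COMPLETION_TOKENS.value, "Max Completion Tokens")
    if supports_reasoning
    else (DefaultParameterName.MAX_TOKENS.value, "Max Tokens")
)

max_token_rule = ParameterRule(
    name=max_token_param_name,
    label=I18nObject(en_US=max_token_param_label, zh_Hans="最大标记"),
    # other ParameterRule fields unchanged
)
```

This keeps the branch in one place without defining get_max_token_param().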
Can be replaced by PARAMETER_RULE_TEMPLATE
Sorry, this is the best I can do. If the model doesn't provide reasoning, the user should use MAX_TOKENS, and if it does, they should use MAX_COMPLETION_TOKENS, but I don't know a better way.
python/pyproject.toml
Outdated
[project]
name = "dify_plugin"
-version = "0.5.1"
+version = "0.6.1"
Please leave version unchanged.
I deleted it
Force-pushed from e651b6e to 8814057.
Pull Request Checklist
Thank you for your contribution! Before submitting your PR, please make sure you have completed the following checks:
Compatibility Check
- I have checked whether this change affects the backward compatibility of the plugin declared in README.md
- I have checked whether this change affects the forward compatibility of the plugin declared in README.md
- If this change introduces a breaking change, I have discussed it with the project maintainer and specified the release version in the README.md
- I have described the compatibility impact and the corresponding version number in the PR description
- I have checked whether the plugin version is updated in the README.md
Available Checks
[Authentication]

Before
After

[Inference]

Before
After
