
Conversation


@chupark chupark commented Oct 29, 2025

Pull Request Checklist

Thank you for your contribution! Before submitting your PR, please make sure you have completed the following checks:

Compatibility Check

  • I have checked whether this change affects the backward compatibility of the plugin declared in README.md

    • ✅ No changes
  • I have checked whether this change affects the forward compatibility of the plugin declared in README.md

    • ✅ Compatible with older Dify versions
  • If this change introduces a breaking change, I have discussed it with the project maintainer and specified the release version in the README.md

    • N/A (no breaking changes)
  • I have described the compatibility impact and the corresponding version number in the PR description

  • I have checked whether the plugin version is updated in the README.md

    • Version bumped from 0.5.1 → 0.6.1

Available Checks

  • Code has passed local tests

[Authentication]
Before: (screenshot)
After: (screenshot)

[Inference]
Before: (screenshot)
After: (screenshot)

  • Relevant documentation has been updated (if necessary)

@gemini-code-assist
Contributor

Summary of Changes

Hello @chupark, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces enhanced support for OpenAI-compatible LLMs by adding a "reasoning option." It allows the system to differentiate between general maximum tokens and maximum completion tokens, adapting the API requests and UI dynamically based on whether a model supports reasoning thoughts. This change improves the flexibility and configurability of LLM integrations within the Dify plugin.

Highlights

  • New LLM Parameter: Introduced "MAX_COMPLETION_TOKENS" in "DefaultParameterName" to support models that differentiate between general "max_tokens" and tokens specifically for completion.
  • Dynamic Token Parameter Handling: Implemented logic to dynamically select between "max_tokens" and "max_completion_tokens" when validating credentials and generating model schemas, based on the "reasoning_thought_support" capability of the LLM (see the sketch after this list).
  • Plugin Version Update: The "dify_plugin" version has been updated from "0.5.1" to "0.6.1" to reflect these new features.
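
For context, a minimal sketch of what this dynamic handling implies for an OpenAI-compatible chat completions request. The credential key and values follow the diff reviewed below; the helper name and payload shape are purely illustrative, not the plugin's actual code:

# Illustrative sketch only: choose the token-limit key based on the
# "reasoning_thought_support" credential introduced in this PR.
def build_chat_payload(model: str, credentials: dict, messages: list, limit: int) -> dict:
    payload = {"model": credentials.get("endpoint_model_name", model), "messages": messages}
    if credentials.get("reasoning_thought_support") == "supported":
        # reasoning-capable models take their limit via max_completion_tokens
        payload["max_completion_tokens"] = limit
    else:
        payload["max_tokens"] = limit
    return payload
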
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature | Command | Description
Code Review | /gemini review | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state.
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help | /gemini help | Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request adds support for an LLM reasoning option for OpenAI-compatible models, which involves using max_completion_tokens instead of max_tokens when reasoning is supported. The changes look good overall, but I've identified a couple of areas with code duplication and repeated function calls that could be improved for better maintainability and efficiency. My review includes suggestions to address these points.

Comment on lines +183 to +187
if credentials.get("reasoning_thought_support") == "supported":
# for reasoning thought support, they use max_completion_tokens
data = {"model": credentials.get("endpoint_model_name", model), "max_completion_tokens": 5}
else:
data = {"model": credentials.get("endpoint_model_name", model), "max_tokens": 5}
Contributor

Severity: medium

There's some code duplication here in the construction of the data dictionary. The {"model": credentials.get("endpoint_model_name", model)} part is repeated in both branches of the if/else statement. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, you could factor out the common parts. For example:

model_name = credentials.get("endpoint_model_name", model)
if credentials.get("reasoning_thought_support") == "supported":
    # for reasoning thought support, they use max_completion_tokens
    data = {"model": model_name, "max_completion_tokens": 5}
else:
    data = {"model": model_name, "max_tokens": 5}

This would make the code cleaner and easier to maintain.

Comment on lines 351 to 352
name=get_max_token_param()["name"],
label=I18nObject(en_US=get_max_token_param()["label"], zh_Hans="最大标记"),
Contributor

Severity: medium

The get_max_token_param() function is called twice here, once for the name and once for the label. This is inefficient. It would be better to call the function once before constructing the AIModelEntity, store its result in a variable, and then reuse that variable. This would avoid the redundant computation. For example:

max_token_param = get_max_token_param()
entity = AIModelEntity(
    # ...
    parameter_rules=[
        # ...
        ParameterRule(
            name=max_token_param["name"],
            label=I18nObject(en_US=max_token_param["label"], zh_Hans="最大标记"),
            # ...
        ),
    ],
    # ...
)

@chupark

chupark commented Oct 29, 2025

PRESENCE_PENALTY = "presence_penalty"
FREQUENCY_PENALTY = "frequency_penalty"
MAX_TOKENS = "max_tokens"
MAX_COMPLETION_TOKENS = "max_completion_tokens"
Collaborator

You need to add this parameter to PARAMETER_RULE_TEMPLATE; it also needs to be added to plugin-daemon.

Author

Also added MAX_COMPLETION_TOKENS into PARAMETER_RULE_TEMPLATE
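
For illustration, a template entry along these lines seems plausible; the structure mirrors the existing MAX_TOKENS entry on the Dify side, but the import path, field names, limits, defaults, and zh_Hans wording here are assumptions rather than the actual diff:

from dify_plugin.entities.model import DefaultParameterName  # import path assumed

# Hypothetical PARAMETER_RULE_TEMPLATE entry for the new parameter;
# values are placeholders, not the ones committed in this PR.
MAX_COMPLETION_TOKENS_TEMPLATE = {
    DefaultParameterName.MAX_COMPLETION_TOKENS: {
        "label": {"en_US": "Max Completion Tokens", "zh_Hans": "最大补全标记"},
        "type": "int",
        "help": {
            "en_US": "Upper limit on tokens generated for the completion itself, "
                     "used by models that budget reasoning tokens separately.",
            "zh_Hans": "补全本身可生成的最大标记数，适用于单独计算推理标记的模型。",
        },
        "required": False,
        "default": 64,
        "min": 1,
        "max": 2048,
    },
}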

features = []

# for reasoning thought support, they use max_completion_tokens
def get_max_token_param():
Collaborator

Can be replaced by PARAMETER_RULE_TEMPLATE

Author

Suggested change
- def get_max_token_param():
+ if credentials.get("reasoning_thought_support") == "supported":
+     max_token_param_name = DefaultParameterName.MAX_COMPLETION_TOKENS.value
+     max_token_param_label = "Max Completion Tokens"
+ else:
+     max_token_param_name = DefaultParameterName.MAX_TOKENS.value
+     max_token_param_label = "Max Tokens"

I'm trying to do this without defining a function. I need to set max_tokens or max_completion_tokens dynamically depending on whether the LLM supports reasoning, but I can't find a better way.

Author

Can be replaced by PARAMETER_RULE_TEMPLATE

Sorry, this is the best I can do. If the model doesn't provide reasoning, the user should use MAX_TOKENS, and if it does, they should use MAX_COMPLETION_TOKENS, but I don't know a better way.
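
For what it's worth, a rough sketch of one way the two suggestions might be combined: pick the parameter name once from the credentials and seed the rule from PARAMETER_RULE_TEMPLATE instead of hard-coding name and label in two places. The import paths, the template's dict shape, and the exact ParameterRule fields are assumptions, not the final implementation:

# Sketch only; import paths and the template's field names are assumed.
from dify_plugin.entities import I18nObject
from dify_plugin.entities.model import DefaultParameterName, ParameterRule
from dify_plugin.entities.model.defaults import PARAMETER_RULE_TEMPLATE

def build_max_token_rule(credentials: dict) -> ParameterRule:
    # choose which token-limit parameter this model expects
    if credentials.get("reasoning_thought_support") == "supported":
        param = DefaultParameterName.MAX_COMPLETION_TOKENS
    else:
        param = DefaultParameterName.MAX_TOKENS
    template = PARAMETER_RULE_TEMPLATE[param]  # assumes the template entry added in this PR
    return ParameterRule(
        name=param.value,
        label=I18nObject(**template["label"]),
        type=template["type"],
        required=template.get("required", False),
        default=template.get("default"),
        min=template.get("min"),
        max=template.get("max"),
    )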

[project]
name = "dify_plugin"
-version = "0.5.1"
+version = "0.6.1"
Collaborator

Please leave version unchanged.

Author

I removed the version change.

@chupark chupark force-pushed the main branch 2 times, most recently from e651b6e to 8814057, on October 30, 2025 at 14:00