Conversation

@Murkylove

PR Category

PR Types

PR Description

@Murkylove Murkylove requested review from a team and Caozhou1995 as code owners November 10, 2025 07:40

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a training configuration for the qwen3-8b model. While the changes align with the PR's goal, the new configuration file examples/qwen3/conf/train/8b.yaml contains critical issues. Specifically, it uses a placeholder for the data_path and an incorrect tokenizer_path which points to a tokenizer for a different model. These will prevent the training from running correctly. I have also suggested a minor improvement for consistency in boolean value representation.



data:
  data_path: /path/to/dataset

critical

The data_path is set to a placeholder value /path/to/dataset. This will cause the training script to fail because it cannot locate the dataset. This path must be updated to a valid location of the training data before running the script.
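As an illustrative sketch only, assuming the dataset has been preprocessed into the Megatron-style indexed format (where data_path is the common prefix of the .bin/.idx pair), the corrected entry might look like the following; the directory shown is hypothetical:

data:
  # Hypothetical path; replace with the real preprocessed dataset prefix
  data_path: /data/qwen3/pretrain/corpus_text_document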

  no_mmap_bin_files: true
tokenizer:
  tokenizer_type: HuggingFaceTokenizer
  tokenizer_path: examples/aquila/tokenizer_hf

critical

The tokenizer_path is set to examples/aquila/tokenizer_hf, which is intended for an Aquila model and uses a GPT2Tokenizer. Using a tokenizer that does not match the qwen3-8b model will result in incorrect tokenization, leading to failed training or a poorly performing model. This path must be updated to point to the correct tokenizer for qwen3-8b.
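As a sketch, assuming the HuggingFaceTokenizer type accepts either a Hugging Face repo id or a local directory containing the Qwen3 tokenizer files (the value below is illustrative, not taken from this PR), the block might read:

tokenizer:
  tokenizer_type: HuggingFaceTokenizer
  # Illustrative value: the official Qwen3-8B repo id on Hugging Face,
  # or a local path to the downloaded tokenizer files
  tokenizer_path: Qwen/Qwen3-8B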

Comment on lines +8 to +9
  reset_position_ids: True
  reset_attention_mask: True

high

The boolean values are specified as True with an uppercase 'T'. While many YAML parsers accept this, the YAML specification and best practices favor lowercase true and false. For consistency with other boolean values in this file (e.g., disable_bias_linear: true) and to ensure compatibility across different environments, these should be changed to lowercase.

Suggested change:

  reset_position_ids: true
  reset_attention_mask: true

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you all sign our Contributor License Agreement before we can accept your contribution.
0 out of 2 committers have signed the CLA.

❌ shaojunsong
❌ Murkylove


shaojunsong does not appear to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account.
You have signed the CLA already but the status is still pending? Let us recheck it.
