add qwen3-8b with megatron v0.15.0rc5 #914
base: main
Conversation
Code Review
This pull request introduces a training configuration for the qwen3-8b model. While the changes align with the PR's goal, the new configuration file examples/qwen3/conf/train/8b.yaml contains critical issues. Specifically, it uses a placeholder for the data_path and an incorrect tokenizer_path which points to a tokenizer for a different model. These will prevent the training from running correctly. I have also suggested a minor improvement for consistency in boolean value representation.
```yaml
data:
  data_path: /path/to/dataset
```
The data_path is set to the placeholder /path/to/dataset, so training cannot run as-is. Please update it to point to an actual preprocessed dataset before this configuration is used.
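For reference, a minimal sketch of what a filled-in data block could look like, assuming a Megatron-style preprocessed dataset; the path below is a hypothetical placeholder for illustration:

```yaml
data:
  # Hypothetical path for illustration; point this at the prefix of the
  # .bin/.idx files produced by your Megatron data preprocessing step.
  data_path: /data/qwen3/pretrain/my_corpus_text_document
```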
```yaml
  no_mmap_bin_files: true
  tokenizer:
    tokenizer_type: HuggingFaceTokenizer
    tokenizer_path: examples/aquila/tokenizer_hf
```
The tokenizer_path is set to examples/aquila/tokenizer_hf, which is intended for an Aquila model and uses a GPT2Tokenizer. Using a tokenizer that does not match the qwen3-8b model will result in incorrect tokenization, leading to failed training or a poorly performing model. This path must be updated to point to the correct tokenizer for qwen3-8b.
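A possible fix, sketched under the assumption that the Qwen3-8B tokenizer is loaded from its Hugging Face repository (Qwen/Qwen3-8B) rather than a copy vendored into this repo; adjust the path to match how tokenizers are actually shipped here:

```yaml
  tokenizer:
    tokenizer_type: HuggingFaceTokenizer
    # Assumed Hugging Face identifier for the Qwen3-8B tokenizer; a local
    # directory containing the tokenizer files would also work here.
    tokenizer_path: Qwen/Qwen3-8B
```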
```yaml
  reset_position_ids: True
  reset_attention_mask: True
```
The boolean values are specified as True with an uppercase 'T'. While many YAML parsers accept this, the YAML specification and best practices favor lowercase true and false. For consistency with other boolean values in this file (e.g., disable_bias_linear: true) and to ensure compatibility across different environments, these should be changed to lowercase.
```yaml
reset_position_ids: true
reset_attention_mask: true
```
shaojunsong seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. If you have already signed the CLA but the status is still pending, let us recheck it.
PR Category
PR Types
PR Description