@Xingfu-Yi
Contributor

Background

Not all pretrained LLMs use <|endoftext|> as the eot_token, so the value should not be hardcoded.

Changes

  • Removed the hardcoded assignment args.end_of_conversation_token = "<|endoftext|>".
  • Added a parser argument, eot_token, which defaults to <|endoftext|>. Users can set it to match the pretrained model they use.
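
As a rough illustration of the two changes above, the sketch below shows one way such an argument could be wired up with argparse. It is not the code from this PR: the function name parse_args, the help text, and the way eot_token is copied onto args.end_of_conversation_token are assumptions made for the example. Only the argument name eot_token, its default, and the attribute name end_of_conversation_token come from the description.

```python
import argparse


def parse_args(argv=None):
    """Hypothetical parser sketch; only `eot_token` and its default come from the PR text."""
    parser = argparse.ArgumentParser(
        description="Sketch: configurable end-of-text token")
    parser.add_argument(
        "--eot_token",
        type=str,
        default="<|endoftext|>",
        help="End-of-text token of the pretrained model (assumed help text).",
    )
    args = parser.parse_args(argv)

    # Assumption: the downstream code still reads `end_of_conversation_token`,
    # so the value is taken from the new argument instead of a hardcoded string.
    args.end_of_conversation_token = args.eot_token
    return args


# Example: a model whose end-of-text token is "</s>" rather than "<|endoftext|>".
example_args = parse_args(["--eot_token", "</s>"])
```

With no flag given, the default keeps the previous behavior, so existing launch scripts would not need to change under this sketch.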

@Xingfu-Yi
Contributor Author

Hi @arashb, @duli2012, @awan-10, @eltonzheng,

I hope you're doing well. When you have a moment, could you kindly take a look at this PR? It has already received one approval, but it seems to be stuck and needs further reviews to move forward.

Thank you so much in advance for your time and help.

Best regards,
Yi

@loadams
Contributor

loadams commented Oct 29, 2024

Hi @Xingfu-Yi - we will work on getting this PR merged, sorry for the delay.

@loadams loadams merged commit eefb0ef into deepspeedai:master Oct 30, 2024
2 checks passed
zhangsmallshark pushed a commit to zhangsmallshark/DeepSpeedExamples that referenced this pull request Feb 12, 2025
Not all pretrained LLMs use `<|endoftext|>` as the `eot_token`, therefore it's inappropriate to fix it.

Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Signed-off-by: zhangsmallshark <[email protected]>
hwchen2017 pushed a commit that referenced this pull request Jun 8, 2025
Not all pretrained LLMs use `<|endoftext|>` as the `eot_token`, therefore it's inappropriate to fix it.

Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Logan Adams <[email protected]>