Open
Description
I am currently exploring the ChatLaw models and I have a few questions regarding their training schemes and roles within the ensemble model.
- Could you please provide detailed information about the training schemes used for ChatLaw2_plain and ChatLaw2E_plain? Specifically, I am interested in the datasets, preprocessing steps, model architectures, and any fine-tuning techniques applied.
- Additionally, I would like to understand the role that ChatLaw2_plain and ChatLaw2E_plain play within the ChatLaw2_MOE (Mixture of Experts) model. How do these models interact and contribute to the overall performance of ChatLaw2_MOE?
Thank you in advance for your assistance. I look forward to your response.