Hi, thanks for releasing LIMO and the training configs for 32B!
In the paper you also reported experiments on 7B, but I couldn’t find the corresponding training parameters/configs in this repo.
Could you please share the training hyperparameters and config files for the 7B experiments (e.g., learning rate, batch size, optimizer settings, training steps, DeepSpeed config)? This would be very helpful for reproducibility and fair comparison.