Megatron FP8 training is compatible with recompute? #1252
Replies: 1 comment

Hi yanchenmochen,
Your question
When I try to run training with FP8 parameters, an OOM occurs. The model is a 7B model, and the same parameters work fine with --bf16.
How do I set the correct FP8 parameters? I tried to reduce memory usage and avoid the OOM with --recompute-granularity selective, but it failed and still hit OOM.
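For context, a Megatron-LM launch typically enables FP8 and activation recomputation with flags like the following. This is only a sketch: the exact flag set depends on your Megatron-LM version (check `arguments.py`), and FP8 additionally requires Transformer Engine and an FP8-capable GPU such as Hopper.

```shell
# Sketch: FP8 + activation-recompute flags for a Megatron-LM GPT pretraining run.
# Model/data/optimizer flags (elided as "...") are assumed identical to the bf16 run.
torchrun --nproc_per_node=8 pretrain_gpt.py \
    ... \
    --bf16 \                              # FP8 runs keep bf16 as the base dtype
    --transformer-impl transformer_engine \
    --fp8-format hybrid \                 # e4m3 forward, e5m2 backward
    --fp8-amax-history-len 1024 \
    --fp8-amax-compute-algo max \
    --recompute-granularity selective     # recompute only memory-heavy attention parts
```

Note that FP8 keeps master weights and optimizer states in higher precision, so it mostly reduces activation and GEMM memory rather than weight/optimizer memory. If selective recompute is not enough, `--recompute-granularity full` together with `--recompute-method uniform --recompute-num-layers 1` trades additional compute for a larger activation-memory saving.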