[QUESTION] Using FP8 OOM, otherwise --bf16 works well #1241
Unanswered
yanchenmochen asked this question in Q&A
When I train a 7B model on an H100 GPU using FP8, it runs out of memory (OOM), while the same parameters train fine with --bf16. What could be the problem?
I tried to reduce memory with --recompute-granularity selective, but it still failed.
The error info is:
Replies: 1 comment
same :(
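For anyone hitting the same OOM, a minimal sketch of instrumentation that can help localize it, assuming a plain PyTorch training step (the `model`, `batch`, and `loss` names in the comments are placeholders, and `report_memory` is a hypothetical helper, not a Megatron-LM API):

```python
import torch

def report_memory(tag: str) -> None:
    """Print current and peak allocated CUDA memory in GiB."""
    alloc = torch.cuda.memory_allocated() / 2**30
    peak = torch.cuda.max_memory_allocated() / 2**30
    print(f"[{tag}] allocated={alloc:.2f} GiB  peak={peak:.2f} GiB")

# Reset the peak counter at the start of a step, then sample around the
# forward and backward passes to see which phase pushes the FP8 run past
# the bf16 run's footprint:
#
#   torch.cuda.reset_peak_memory_stats()
#   loss = model(batch)
#   report_memory("after forward")
#   loss.backward()
#   report_memory("after backward")
```

Comparing these numbers between the --bf16 run and the FP8 run narrows down whether the extra usage comes from activations (which --recompute-granularity targets) or from weight- and optimizer-side buffers that recomputation cannot reduce.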