[QUESTION] How can a checkpoint saved in one parallel configuration (tensor/pipeline/data parallelism) be loaded in a different parallel configuration? #1242
Unanswered
polisettyvarma asked this question in Q&A
Replies: 2 comments, 1 reply
- You can check out the convert tool in the tools folder (see the example invocation after the replies).
1 reply
  - Can someone answer this?
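For readers who find this thread: the "convert in tools folder" above refers to Megatron-LM's offline checkpoint converter, `tools/checkpoint/convert.py` in recent trees (`tools/checkpoint_util.py` in older ones). Below is a hedged sketch of an invocation that reshards a GPT checkpoint from one tensor/pipeline-parallel layout to another; the directory paths are placeholders, and the exact loader/saver names and flags depend on the Megatron-LM version you have checked out:

```bash
# Reshard a GPT checkpoint from (TP=4, PP=2) to (TP=2, PP=1).
# Paths are placeholders; loader/saver names vary by Megatron-LM version.
python tools/checkpoint/convert.py \
    --model-type GPT \
    --loader megatron \
    --saver megatron \
    --load-dir /checkpoints/gpt_tp4_pp2 \
    --save-dir /checkpoints/gpt_tp2_pp1 \
    --target-tensor-parallel-size 2 \
    --target-pipeline-parallel-size 1
```

Note that the data-parallel size generally needs no conversion for the model weights, since weights are replicated across data-parallel ranks; sharded optimizer state is a separate concern.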
- How can a checkpoint saved in one parallel configuration (tensor/pipeline/data parallelism) be loaded in a different parallel configuration?
  The doc at https://github.com/NVIDIA/Megatron-LM/blob/main/docs/source/api-guide/dist_checkpointing.rst contains some conflicting statements about this. Can you provide a working end-to-end example to showcase this feature? Thanks.
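On the dist_checkpointing side, the linked guide describes checkpoints whose tensors carry their global placement (as ShardedTensors), which is what lets a checkpoint saved under one parallel layout be loaded under another. The following is a minimal sketch of that mechanism, not an official example: the checkpoint directory, shapes, and launch commands are made up, and details such as the default save strategy differ across Megatron-LM versions.

```python
# Minimal resharding demo with megatron.core.dist_checkpointing.
# Hypothetical usage (paths/sizes are made up):
#   torchrun --nproc-per-node 2 demo.py save   # save with TP=2
#   torchrun --nproc-per-node 1 demo.py load   # load with TP=1
import os
import sys

import torch
from megatron.core import dist_checkpointing, parallel_state
from megatron.core.dist_checkpointing import ShardedTensor

CKPT_DIR = '/tmp/dist_ckpt_demo'   # hypothetical checkpoint location
GLOBAL_SHAPE = (8, 8)              # fixed global shape of the weight


def sharded_state_dict():
    # Shard the weight along axis 0 across tensor-parallel ranks. The
    # ShardedTensor records this rank's offset within the global tensor,
    # which is what makes the checkpoint parallelism-agnostic.
    tp = parallel_state.get_tensor_model_parallel_world_size()
    rank = parallel_state.get_tensor_model_parallel_rank()
    local = torch.full((GLOBAL_SHAPE[0] // tp, GLOBAL_SHAPE[1]), float(rank))
    return {
        'weight': ShardedTensor.from_rank_offsets(
            'weight', local, (0, rank, tp)),  # (axis, offset, num shards)
    }


def main(mode):
    torch.distributed.init_process_group(backend='gloo')
    parallel_state.initialize_model_parallel(
        tensor_model_parallel_size=torch.distributed.get_world_size())

    if mode == 'save':
        if torch.distributed.get_rank() == 0:
            os.makedirs(CKPT_DIR, exist_ok=True)
        torch.distributed.barrier()
        dist_checkpointing.save(sharded_state_dict(), CKPT_DIR)
    else:
        # Each rank requests the slice its CURRENT layout needs; the
        # library reassembles it from however the tensor was sharded
        # at save time.
        loaded = dist_checkpointing.load(sharded_state_dict(), CKPT_DIR)
        print(torch.distributed.get_rank(), loaded['weight'].shape)


if __name__ == '__main__':
    main(sys.argv[1])
```

With a real network the pattern is the same, except the sharded state dict comes from `model.sharded_state_dict()` on a Megatron Core model constructed under the new parallel configuration.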