[QUESTION] vicuna-7b-v1.5 weight conversion from huggingface to megatron-lm format #1181
Replies: 6 comments
- I'm also interested in this, and more generally in how Megatron can be used to convert from HF, continue pretraining, and convert back to HF.
- Same issue with a different model.
- My understanding is that
- Also, if you do need
- Thanks, man. I used that approach and changed the convert command accordingly. Anyway, your answer really helped me; before that I had spent a long time trying to figure out the problem.
- Marking as stale. No activity in 60 days.
- I am trying to convert the weights for `vicuna-7b-v1.5` from Hugging Face Transformers (https://huggingface.co/lmsys/vicuna-7b-v1.5) so they can be used with Megatron-LM. I am using `tools/checkpoint/convert.py` to do the conversion. The command I used is as follows:

  When I run it, I get an error like this:

  I looked into it, and the error seems to be raised here:

  Megatron-LM/megatron/core/parallel_state.py, lines 563 to 569 in 7fe863f

  because `_TENSOR_MODEL_PARALLEL_GROUP` does not have a value set. However, I found that `_TENSOR_MODEL_PARALLEL_GROUP` is assigned in only one place in the whole codebase:

  Megatron-LM/megatron/core/parallel_state.py, line 379 in 7fe863f

  and that function, `initialize_model_parallel`, does not seem to be called during weight conversion. How can I do the weight conversion correctly?
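  To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern described above: a module-level group handle that stays `None` until `initialize_model_parallel` runs, and a getter that asserts it was set. The names mirror `megatron/core/parallel_state.py`, but the bodies are illustrative stand-ins (the real initializer builds `torch.distributed` process groups), so treat this as an assumption-labeled sketch, not Megatron's actual code.

  ```python
  # Simplified sketch of the initialization pattern in parallel_state.py.
  # The real module stores a torch.distributed process group; we use a
  # plain sentinel object here so the example is runnable stand-alone.

  _TENSOR_MODEL_PARALLEL_GROUP = None


  def initialize_model_parallel(tensor_model_parallel_size: int = 1) -> None:
      """Stand-in for the real initializer: populates the module-level group."""
      global _TENSOR_MODEL_PARALLEL_GROUP
      # Real code: _TENSOR_MODEL_PARALLEL_GROUP = torch.distributed.new_group(...)
      _TENSOR_MODEL_PARALLEL_GROUP = object()


  def get_tensor_model_parallel_group():
      """Mirrors the getter that raises when initialization was skipped."""
      assert _TENSOR_MODEL_PARALLEL_GROUP is not None, \
          'tensor model parallel group is not initialized'
      return _TENSOR_MODEL_PARALLEL_GROUP
  ```

  Calling `get_tensor_model_parallel_group()` before `initialize_model_parallel()` trips the assertion, which matches the error seen during conversion: the converter exercises code paths that read the group without any entry point having initialized model parallelism first.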