Problems
When testing LLaVA-v1.5 with eval.py, the following error occurs:

```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices,
cuda:0 and cuda:1! (when checking argument for argument mat1 in method wrapper_CUDA_addmm)
```
This happens because Hugging Face loads the model with the default parameter device_map="auto", so the model is sharded across multiple GPUs (pipeline parallelism):
```python
def load_pretrained_model(model_path, model_base, model_name,
                          load_8bit=False, load_4bit=False,
                          device_map="auto", device="cuda", **kwargs):
    ...
```

Meanwhile, in eval.py, .cuda() is called on the wrapped model (MLLM_Tester), which moves all of the model's parameters onto the default GPU again:
Line 171 in fbc5f2c:

```python
model = build_model(args.model).cuda()
```
This conflicts with Accelerate's AlignDevicesHook: the hook still moves the inputs of some layers to the other GPUs, while all parameters now sit on the default GPU, which triggers the error above.
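The conflict can be illustrated with a toy simulation (pure Python, no GPUs or torch required; the layer names and device map below are illustrative, not the real LLaVA map). The hook keeps moving each layer's input to the device recorded in the stale device map, while `.cuda()` has already moved every parameter to cuda:0:

```python
# Toy simulation of the device conflict (no torch/GPU required).
# device_map mimics what device_map="auto" produces: layers split across GPUs.
device_map = {"layer.0": "cuda:0", "layer.1": "cuda:1"}  # illustrative names

# After `model.cuda()`, every parameter ends up on the default GPU:
param_device = {name: "cuda:0" for name in device_map}

def forward(name, input_device):
    # The hook still moves the input to the device in the stale map...
    input_device = device_map[name]
    # ...but the parameters now live on cuda:0, so addmm sees two devices.
    if input_device != param_device[name]:
        raise RuntimeError(
            f"Expected all tensors to be on the same device, but found "
            f"{param_device[name]} and {input_device}!"
        )
    return input_device

forward("layer.0", "cuda:0")    # fine: input and parameters both on cuda:0
# forward("layer.1", "cuda:0")  # raises: parameters on cuda:0, input on cuda:1
```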
Solution
I think removing .cuda() here is fine, though I have only verified the LLaVA interface.
Line 171 in fbc5f2c:

```python
model = build_model(args.model).cuda()
```
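An alternative to deleting the call outright would be to guard it: only move the model when it is not already dispatched across devices. Accelerate sets an `hf_device_map` attribute on models it dispatches, which this sketch inspects; the `DummyModel` class is a stand-in for the real `build_model()` output, and the helper name is hypothetical:

```python
# Sketch: skip .cuda() when the model is already sharded across GPUs.
# `hf_device_map` is the attribute Accelerate sets on dispatched models;
# DummyModel below is only a stand-in so the sketch is self-contained.

def should_call_cuda(model) -> bool:
    """Return True only if the model is not sharded over multiple devices."""
    device_map = getattr(model, "hf_device_map", None)
    if not device_map:
        return True  # plain single-device model: safe to move
    # Sharded over more than one device: let Accelerate's hooks handle placement.
    return len(set(device_map.values())) <= 1

class DummyModel:
    def __init__(self, hf_device_map=None):
        if hf_device_map is not None:
            self.hf_device_map = hf_device_map

print(should_call_cuda(DummyModel()))                              # True
print(should_call_cuda(DummyModel({"layer.0": 0, "layer.1": 1})))  # False
```

In eval.py the unconditional call could then become `model = build_model(args.model)` followed by `model = model.cuda() if should_call_cuda(model) else model`.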