Use the set of experiments provided by the projects team to compare the two models. We'll want to test both single-GPU performance (using LoRA for MADLAD) and multi-GPU performance (NLLB 3.3B and full MADLAD).
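One way to keep the planned runs straight is to write them down as a small experiment matrix. The sketch below is only an illustration: `RunConfig`, `build_matrix`, and `group_by_hardware` are hypothetical names, not part of any existing tooling, and only the three setups named explicitly above are listed. The NLLB single-GPU setup is not specified in the note, so it is left out rather than guessed, and fields the note does not specify are set to `None`.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RunConfig:
    """One planned run. All field names are hypothetical, chosen for illustration."""
    model: str               # model family as named in the note
    variant: Optional[str]   # size/variant; None where the note does not specify one
    finetune: Optional[str]  # "lora" or "full"; None where the note does not specify
    hardware: str            # "single-gpu" or "multi-gpu"


def build_matrix() -> List[RunConfig]:
    """Return only the setups stated explicitly in the note."""
    return [
        RunConfig("MADLAD", None, "lora", "single-gpu"),
        RunConfig("NLLB", "3.3B", None, "multi-gpu"),
        RunConfig("MADLAD", None, "full", "multi-gpu"),
    ]


def group_by_hardware(runs: List[RunConfig]) -> Dict[str, List[RunConfig]]:
    """Group runs by hardware setup so each group can be scheduled together."""
    groups: Dict[str, List[RunConfig]] = defaultdict(list)
    for run in runs:
        groups[run.hardware].append(run)
    return dict(groups)


if __name__ == "__main__":
    for hardware, runs in group_by_hardware(build_matrix()).items():
        print(hardware)
        for run in runs:
            print("  ", run)
```

Each entry would then be paired with the experiments from the projects team. Any further rows (for example an NLLB single-GPU run) should be added only once the intended variant is confirmed.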