Compare NLLB and MADLAD-400

Use the set of experiments provided by the projects team to compare the two models.

We'll want to test both single-GPU performance (using LoRA for MADLAD) and multi-GPU performance (NLLB 3.3B and full MADLAD).
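
One way to keep the comparison organized is to write the runs down as a small matrix before launching anything. The sketch below is only an illustration: the `Run` class, its field names, and the `runs_for` helper are hypothetical and not part of any existing tooling. Where the description above does not say something (the NLLB size for the single-GPU run, or how NLLB is trained), the value is left as `None` or `"unspecified"` rather than guessed. "full" is recorded exactly as worded above, without assuming whether it means the full model or full fine-tuning.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Run:
    """One cell of the comparison matrix (hypothetical structure)."""
    model: str                      # "NLLB" or "MADLAD-400"
    hardware: str                   # "single-gpu" or "multi-gpu"
    mode: str                       # "lora", "full", or "unspecified"
    variant: Optional[str] = None   # model size, only when the ticket names one


# The four runs implied by the description; nothing beyond it is filled in.
RUNS: List[Run] = [
    Run("NLLB", "single-gpu", "unspecified"),
    Run("MADLAD-400", "single-gpu", "lora"),
    Run("NLLB", "multi-gpu", "unspecified", variant="3.3B"),
    Run("MADLAD-400", "multi-gpu", "full"),
]


def runs_for(hardware: str) -> List[Run]:
    """Return the runs planned for one hardware setting."""
    return [run for run in RUNS if run.hardware == hardware]


if __name__ == "__main__":
    for run in RUNS:
        print(run)
```

Calling `runs_for("single-gpu")` returns the two single-GPU entries, which makes it easy to check that each model appears once per hardware setting.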