1 parent c1ab42b commit f60f9c8
megatron/post_training/docs/distillation.md
@@ -75,7 +75,7 @@ Model Optimizer modifies the model using the loss criterion present in the disti
defines a loss function between two module attribute names of the teacher and student model, respectively.

Default loss function used between logits is a KL-Divergence Loss and loss used among intermediate tensors is Cosine-Similarity,
-both defined in `megatron/inference/algos/distillation.py`.
+both defined in `modelopt.torch.distill.plugins.megatron`.

## Restrictions
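For readers skimming this change, a minimal PyTorch sketch of what the two default criteria compute is below. It is illustrative only, not the library code: the actual implementations live in `modelopt.torch.distill.plugins.megatron` and may differ in details such as temperature scaling, reduction, or sequence masking. The helper names here are hypothetical.

```python
import torch
import torch.nn.functional as F


def logits_kl_loss(student_logits: torch.Tensor,
                   teacher_logits: torch.Tensor,
                   temperature: float = 1.0) -> torch.Tensor:
    """KL divergence from teacher to student over the vocabulary dimension.

    Hypothetical stand-in for the default logit criterion; the real module
    in modelopt.torch.distill.plugins.megatron may differ.
    """
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # "batchmean" averages the per-sample summed KL over the batch, matching
    # the mathematical definition of KL divergence.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2


def intermediate_cosine_loss(student_hidden: torch.Tensor,
                             teacher_hidden: torch.Tensor) -> torch.Tensor:
    """1 - cosine similarity between intermediate tensors, averaged."""
    cos = F.cosine_similarity(student_hidden, teacher_hidden, dim=-1)
    return (1.0 - cos).mean()


# Example usage with dummy shapes (batch, seq, vocab) and (batch, seq, hidden):
if __name__ == "__main__":
    s_logits, t_logits = torch.randn(2, 8, 512), torch.randn(2, 8, 512)
    s_hidden, t_hidden = torch.randn(2, 8, 64), torch.randn(2, 8, 64)
    print(logits_kl_loss(s_logits, t_logits).item())
    print(intermediate_cosine_loss(s_hidden, t_hidden).item())
```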