
Conversation

@deepakn94 (Contributor)

No description provided.

@deepakn94 requested review from a team as code owners on November 19, 2025 at 02:16
@deepakn94 added this to the Core 0.16 milestone on November 19, 2025
@deepakn94 self-assigned this on November 19, 2025
@deepakn94 force-pushed the dnarayanan/latent_moe branch from d088236 to b0a2d8c on November 19, 2025 at 02:32
# Project the output back from latent dimension to hidden dimension after combine
# in latent dimension.
if self.config.moe_latent_size:
    output, _ = self.fc2_latent_proj(output)
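For context, a minimal sketch of how such a latent projection could be wired up, assuming a layer that returns its bias separately instead of adding it. The class name, sizes, and tuple-return convention here are illustrative assumptions, not the PR's actual code:

import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentProjLinear(nn.Module):
    # Illustrative layer: projects from the latent dimension back to the hidden
    # dimension and returns (output, bias), leaving the bias add to the caller.
    def __init__(self, latent_size, hidden_size, add_bias_linear=True):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(hidden_size, latent_size))
        nn.init.xavier_uniform_(self.weight)
        self.bias = nn.Parameter(torch.zeros(hidden_size)) if add_bias_linear else None

    def forward(self, x):
        # The caller unpacks (output, bias); in the snippet above the bias is
        # discarded with `_`.
        return F.linear(x, self.weight), self.bias


# Assumed sizes, for illustration only.
hidden_size, moe_latent_size = 1024, 256
fc2_latent_proj = LatentProjLinear(moe_latent_size, hidden_size)
tokens = torch.randn(8, moe_latent_size)
output, output_bias = fc2_latent_proj(tokens)  # output: [8, 1024]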

Review comment:

fc2_latent_proj might return a bias if self.config.add_bias_linear is set.

@deepakn94 (Contributor, Author):

Good point. But what should be the right behavior here if this layer does have a bias?

if self.config.moe_latent_size and mlp_bias is not None:
    output = output + mlp_bias
    mlp_bias = None
output = self.combine(output, shared_expert_output)

Review comment:

Here, output will be in the latent dimension, while shared_expert_output will be in the hidden dimension. We may have to move self.fc2_latent_proj inside self.combine, before the addition of output and shared_expert_output.
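For illustration, a minimal sketch of that ordering, with the latent-to-hidden projection applied before the shared-expert output is added. The helper name and arguments are assumptions, not the PR's actual combine API:

def combine_with_latent_proj(expert_output, fc2_latent_proj=None, shared_expert_output=None):
    # Sketch only: if the experts ran in a latent dimension, first project their
    # combined output back to the hidden dimension (adding the projection bias
    # if one is returned), and only then add the shared-expert output, which is
    # already in the hidden dimension.
    output = expert_output
    if fc2_latent_proj is not None:
        output, proj_bias = fc2_latent_proj(output)  # [tokens, latent] -> [tokens, hidden]
        if proj_bias is not None:
            output = output + proj_bias
    if shared_expert_output is not None:
        # Both tensors now have the hidden dimension, so the addition is well-defined.
        output = output + shared_expert_output
    return output

With a projection like the one sketched earlier, both operands of the final addition are in the hidden dimension at the point where they are summed.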

@deepakn94 (Contributor, Author):

Good point, moved.

@deepakn94 force-pushed the dnarayanan/latent_moe branch from b0a2d8c to 4463081 on November 22, 2025 at 00:34