thomasywang
Contributor

Summary:
An ActorMesh's shape might have large extents on some dimensions. Those dimensions would cause large fanout in our comm actor implementation. To avoid that, we reshape it by increasing the dimensionality and limiting the extent of each dimension. Note: the reshape is only visible to the internal algorithm. The shape that the user sees remains intact.

For example, a typical shape is [hosts=1024, gpus=8]. With a limit of 8, it becomes [8, 8, 8, 2, 8] during casting. In other words, it adds 3 extra layers to the comm actor tree while keeping the fanout in each layer at 8 or smaller.

The cast fanout limit is configured by the key `CASTING_FANOUT_SIZE`, which currently defaults to 0, disabling the feature.
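
To make the reshape concrete, here is a minimal Python sketch of one way an extent could be split under a fanout limit. It is an illustration only, not the code in this PR: the names `split_extent` and `reshape_with_limit` are hypothetical, and the greedy "largest divisor no greater than the limit" rule is an assumption that happens to reproduce the [8, 8, 8, 2, 8] example above. Treating a limit of 0 as "leave the shape unchanged" mirrors the default described for `CASTING_FANOUT_SIZE`.

```python
def split_extent(extent: int, limit: int) -> list[int]:
    """Greedily factor `extent` into pieces that are each <= `limit`.

    Assumption (not from the PR): at each step take the largest divisor
    of the remaining extent that does not exceed `limit`.
    """
    # A limit of 0 disables reshaping; a limit of 1 cannot split anything.
    if limit <= 1 or extent <= limit:
        return [extent]

    factors = []
    remaining = extent
    while remaining > limit:
        # Largest divisor of `remaining` in the range [2, limit], if any.
        divisor = next(
            (d for d in range(limit, 1, -1) if remaining % d == 0), None
        )
        if divisor is None:
            # No usable divisor (e.g. a large prime): leave the rest unsplit.
            break
        factors.append(divisor)
        remaining //= divisor
    factors.append(remaining)
    return factors


def reshape_with_limit(sizes: list[int], limit: int) -> list[int]:
    """Apply `split_extent` to every dimension and flatten the result."""
    reshaped = []
    for extent in sizes:
        reshaped.extend(split_extent(extent, limit))
    return reshaped
```

With these definitions, `reshape_with_limit([1024, 8], 8)` yields `[8, 8, 8, 2, 8]`, and `reshape_with_limit([1024, 8], 0)` returns the shape unchanged. The product of the extents is preserved in both cases, so the reshaped view addresses the same set of ranks.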

Differential Revision: D82320948

@meta-cla meta-cla bot added the CLA Signed label Sep 12, 2025
@facebook-github-bot
Contributor

@thomasywang has exported this pull request. If you are a Meta employee, you can view the originating diff in D82320948.

thomasywang added a commit to thomasywang/monarch-1 that referenced this pull request Sep 19, 2025
Labels: CLA Signed, fb-exported, meta-exported