Skip to content

[bugfix] fix glm5.2 indexer_type 'shared'#134

Merged
Jintao-Huang merged 5 commits into
modelscope:mainfrom
Jintao-Huang:fix_glm_5_indexer_shared
Jun 23, 2026
Merged

[bugfix] fix glm5.2 indexer_type 'shared'#134
Jintao-Huang merged 5 commits into
modelscope:mainfrom
Jintao-Huang:fix_glm_5_indexer_shared

Conversation

@Jintao-Huang

Copy link
Copy Markdown
Collaborator

No description provided.

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request updates the GPT bridge to conditionally handle the indexer state dict update when using the 'dsa' experimental attention variant by checking and reducing the indexer's presence across the pipeline parallel group. Feedback suggests using defensive programming with getattr to avoid potential AttributeError exceptions when accessing core_attention and indexer on mg_attn.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread src/mcore_bridge/bridge/gpt_bridge.py
@Jintao-Huang Jintao-Huang changed the title [bugfix] fix glm5 indexer shared [bugfix] fix glm5.2 indexer_type 'shared' Jun 23, 2026
@Jintao-Huang

Copy link
Copy Markdown
Collaborator Author

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces support for GLM 5.2 with a shared DSA (Dense Sparse Attention) indexer by adding the glm_moe_dsa model implementation. This includes custom attention, GPT model, transformer block, and loader classes to manage shared indexer states across pipeline parallel stages, as well as bridge updates for handling indexer states. The feedback suggests defensive retrieval of indexer_types to prevent potential AttributeErrors and specifying the dimension in squeeze() on the attention mask to avoid unintended squeezing of the batch dimension when the batch size is 1.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.


def __init__(self, config, submodules, layer_number, *args, **kwargs):
super().__init__(config, submodules, layer_number, *args, **kwargs)
indexer_types = config.hf_config.indexer_types

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential AttributeError if config.hf_config is None, it is safer to use getattr to retrieve indexer_types defensively.

Suggested change
indexer_types = config.hf_config.indexer_types
indexer_types = getattr(getattr(config, 'hf_config', None), 'indexer_types', None)

else:
b = query.size(1)
assert attention_mask.shape == (b, 1, sq, skv)
mask = attention_mask.squeeze()

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Using attention_mask.squeeze() without specifying a dimension is dangerous because if the batch size b is 1, it will squeeze the batch dimension as well, leading to inconsistent tensor shapes (e.g., (sq, skv) instead of (1, sq, skv)). Specifying the dimension to squeeze (dimension 1, which is the singleton dimension for attention heads/groups) ensures consistent shapes regardless of the batch size.

Suggested change
mask = attention_mask.squeeze()
mask = attention_mask.squeeze(1)

def get_transformer_layer_spec(self, vp_stage: Optional[int] = None):
transformer_layer_spec = super().get_transformer_layer_spec(vp_stage)

indexer_types = self.config.hf_config.indexer_types

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

To prevent potential AttributeError if self.config.hf_config is None, it is safer to use getattr to retrieve indexer_types defensively.

Suggested change
indexer_types = self.config.hf_config.indexer_types
indexer_types = getattr(getattr(self.config, 'hf_config', None), 'indexer_types', None)

@Jintao-Huang Jintao-Huang merged commit 86312fc into modelscope:main Jun 23, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants