Contributor

@libinta libinta commented Jul 17, 2025

No description provided.

Comment on lines 98 to +101
  if found_bucket is None:
+     block_size = self.slice_size if self.use_window_sdpa else self.block_size
      new_batch_size = 2 ** math.ceil(math.log2(batch_size))
-     new_seq_len = math.ceil(seq_len / self.block_size) * self.block_size
+     new_seq_len = math.ceil(seq_len / block_size) * block_size
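
For illustration, the fallback rounding in the hunk above can be sketched as a standalone function. This is a minimal sketch, not the PR's code: the name `fallback_bucket`, its signature, and the sample values (including a slice size of 1024) are hypothetical. Only the two rounding formulas are taken from the diff.

```python
import math


def fallback_bucket(batch_size: int, seq_len: int, block_size: int) -> tuple[int, int]:
    """Round (batch_size, seq_len) up to a padded shape.

    Hypothetical helper: the name and signature are assumptions,
    only the two formulas below come from the diff.
    """
    # Batch size is rounded up to the next power of two.
    new_batch_size = 2 ** math.ceil(math.log2(batch_size))
    # Sequence length is rounded up to the next multiple of block_size.
    new_seq_len = math.ceil(seq_len / block_size) * block_size
    return new_batch_size, new_seq_len


# Assumed values: block size 128, slice size 1024.
print(fallback_bucket(3, 1100, 128))   # (4, 1152)
print(fallback_bucket(3, 1100, 1024))  # (4, 2048)
```

With the assumed block size of 128, a length of 1100 pads to 1152, which is not a multiple of 1024. Rounding against the assumed slice size of 1024 instead pads to 2048, which is the behavior the added `block_size` variable selects when `use_window_sdpa` is set.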
Contributor

This is hacky and doesn't fix the underlying problem: why wasn't that bucket found in the first place?
If a multiple of 1k is a hard requirement from FSDPA, you could instead try setting block_size to a multiple of 1k.
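
To make the suggestion concrete, here is a rough sketch of why it would work, assuming "1k" means 1024 and that the requirement is on the padded sequence length. The helper names `padded_len` and `always_aligned` are hypothetical and do not appear in the PR.

```python
def padded_len(seq_len: int, block_size: int) -> int:
    """Round seq_len up to the next multiple of block_size (integer ceiling)."""
    return -(-seq_len // block_size) * block_size


def always_aligned(block_size: int, slice_size: int, max_len: int) -> bool:
    """Check that every padded length up to max_len is a multiple of slice_size."""
    return all(
        padded_len(n, block_size) % slice_size == 0
        for n in range(1, max_len + 1)
    )


SLICE_SIZE = 1024  # assumption: "1k" read as 1024

for block_size in (128, 1024, 2048):
    print(block_size, always_aligned(block_size, SLICE_SIZE, max_len=8192))
```

If `block_size` is itself a multiple of the slice size, every multiple of `block_size` is too, so the existing rounding would satisfy the constraint without a separate code path.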

padding_side ='left'
args.append(window_size)

args = [query, key, value, attn_bias, 0.0, is_causal,
Contributor

Is this change related to the missing bucket? If not, I think you could extract it into a separate PR so that it can be merged on its own.

5 participants