@copybara-service copybara-service bot commented Nov 14, 2025

  • Add a robust_masking argument to the Softmax layer to enable better numerical handling of the mask (currently, if the mask violates any of the layer's assumptions, the layer silently produces numerically meaningless results).
  • Plumb an argument through the official Keras MultiHeadAttention layer and the Model Garden TransformerEncoderBlock layer that opts into the new softmax masking.
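
The kind of silent failure described above can be illustrated with a small standalone sketch. It is not the code from this change: it assumes the mask is applied by adding a large negative bias to masked logits (a common approach), and both function names, the LARGE_NEGATIVE constant, and the "robust" behavior shown are hypothetical choices made for illustration only.

```python
import math

# Assumed additive bias for masked positions (illustrative value only).
LARGE_NEGATIVE = -1e9


def additive_mask_softmax(logits, mask):
    """Softmax where masked-out positions get a large negative bias added."""
    biased = [x + (0.0 if keep else LARGE_NEGATIVE)
              for x, keep in zip(logits, mask)]
    peak = max(biased)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in biased]
    total = sum(exps)
    return [e / total for e in exps]


def where_mask_softmax(logits, mask):
    """Hypothetical robust variant: masked positions are excluded outright,
    and a fully masked row yields all zeros instead of a distribution."""
    kept = [x for x, keep in zip(logits, mask) if keep]
    if not kept:
        return [0.0] * len(logits)
    peak = max(kept)
    exps = [math.exp(x - peak) if keep else 0.0
            for x, keep in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


logits = [1.0, 2.0, 3.0]

# Partially masked row: both variants agree.
partial = [True, True, False]
print(additive_mask_softmax(logits, partial))
print(where_mask_softmax(logits, partial))

# Fully masked row: the additive variant still returns positive weights
# for positions that were supposed to be masked out.
nothing = [False, False, False]
print(additive_mask_softmax(logits, nothing))
print(where_mask_softmax(logits, nothing))
```

With a partially masked row the two variants match. With a fully masked row, the additive bias cancels out when the maximum is subtracted, so the additive variant returns a normal-looking distribution over positions that should carry no weight, with no error or warning. Returning zeros for such a row is one possible design choice; whether robust_masking does this is not stated in the description above.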

@copybara-service copybara-service bot changed the title from "* Create a SoftmaxV2 layer which has better numerical handling of the mask (currently if the mask violates any of the assumptions it will do numerically silly things silently)." to "* Add an argument robust_masking to the Softmax layer to enable better numerical handling of the mask (currently if the mask violates any of the assumptions it will do numerically silly things silently)." on Nov 14, 2025
@copybara-service copybara-service bot closed this Nov 14, 2025
@copybara-service copybara-service bot deleted the test_831896060 branch November 14, 2025 17:58