Skip to content

[CIR][CIRGen][Builtin][X86] Lower AVX masked load intrinsics #1763

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 29, 2025

Conversation

RiverDave
Copy link
Collaborator

@RiverDave RiverDave commented Jul 27, 2025

For these intrinsics there only seems to be one function where the IR emmited seems to diverge:

for _mm_load_sbh loads a single 16-bit bfloat (__bf16) value from memory into the lowest element of a 128-bit bfloat vector (__m128bh), leaving the remaining lanes unchanged or filled with a passthrough value. It is implemented using a masked load with only the first lane enabled.

source for intrinsics with similar behaviour

In the CIR lowering of _mm_load_sbh, we are currently emitting the mask of intrinsic (llvm.masked.load) operand as an explicit constant vector:

<8 x i1> <true, false, false, false, false, false, false, false>

whereas OG lowers:

<8 x i1> bitcast (<1 x i8> splat (i8 1) to <8 x i1>)

I believe both things are semantically equal so:

Is it acceptable for CIR and OG to diverge in this way for masked loads, or should we aim for parity in how the mask is represented, even if that reduces readability in CIR?

@bcardosolopes
Copy link
Member

bcardosolopes commented Jul 28, 2025

Is it acceptable for CIR and OG to diverge in this way for masked loads, or should we aim for parity in how the mask is represented, even if that reduces readability in CIR?

Sine they are semantically equivalent it's ok if we diverge in cases like this (the change needed is orthogonal to this PR). However, we'd like to converge eventually, in the meantime we should track it in an issue (so someone can do follow up work that touches parts not related with this PR) and also writing a comment in the tests on how this looks like in OG. (Btw, we have a issue label for IR-differences).

@bcardosolopes
Copy link
Member

Needs fixing conflict resolution

@RiverDave
Copy link
Collaborator Author

RiverDave commented Jul 29, 2025

Is it acceptable for CIR and OG to diverge in this way for masked loads, or should we aim for parity in how the mask is represented, even if that reduces readability in CIR?

Sine they are semantically equivalent it's ok if we diverge in cases like this (the change needed is orthogonal to this PR). However, we'd like to converge eventually, in the meantime we should track it in an issue (so someone can do follow up work that touches parts not related with this PR) and also writing a comment in the tests on how this looks like in OG. (Btw, we have a issue label for IR-differences).

Sounds amazing! Looking forward on working on those. Related to this I've opened #1767 (I don't have permissions to add the label yet). I've also added a brief note on the test referencing the issue as suggested.

@bcardosolopes bcardosolopes merged commit 46686b7 into llvm:main Jul 29, 2025
6 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants