[Microbenchmarks] Add benchmark for conditional scalar assignment autovec #295

Open

huntergr-arm wants to merge 3 commits into llvm:main from huntergr-arm:conditional-scalar-assignment-microbenchmark

+223 −0

huntergr-arm commented Nov 13, 2025

Benchmarks with vs. without autovec for a loop containing conditional
scalar assignment (plus a little extra arithmetic as a 'work payload').


          [Microbenchmarks] Add benchmark for conditional scalar assignment aut…

489b92e

…ovec

huntergr-arm requested review from MacDue, fhahn and sdesmalen-arm

November 13, 2025 13:51

Author

huntergr-arm commented Nov 13, 2025

Microbenchmark for FindLast/CSA autovec, as requested on llvm/llvm-project#158088

With just the conditional assignment in the loop, there was no noticeable performance difference. However, when I added a small arithmetic payload I saw a noticeable difference, especially for uint8_t.
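For readers unfamiliar with the pattern, here is a minimal sketch of a conditional scalar assignment loop with a small arithmetic payload, in a vectorized and a non-vectorized flavour. It is not the code from this PR: the function names, the explicit `N` parameter, and the assumption that the input values in `A` lie in 0..100 (so that 101 works as an out-of-range default) are illustrative only. The `#pragma clang loop` hints are Clang-specific; other compilers may ignore them.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch, not the benchmark source from the PR.
// Conditional scalar assignment (CSA): Result ends up holding the value
// from the last iteration whose condition was true, or the default
// value if the condition was never true.
template <typename T>
T csa_with_payload_vec(const T *A, const T *B, T *C, std::size_t N,
                       T Threshold) {
  // Default assumed to be outside the range of values stored in A.
  T Result = 101;
#pragma clang loop vectorize(enable)
  for (std::size_t I = 0; I < N; ++I) {
    // Small arithmetic payload so the loop does more than the select.
    C[I] = static_cast<T>(A[I] + B[I]);
    if (A[I] > Threshold)
      Result = A[I];
  }
  return Result;
}

// Same loop body with vectorization and interleaving disabled, to serve
// as the scalar baseline.
template <typename T>
T csa_with_payload_novec(const T *A, const T *B, T *C, std::size_t N,
                         T Threshold) {
  T Result = 101;
#pragma clang loop vectorize(disable) interleave(disable)
  for (std::size_t I = 0; I < N; ++I) {
    C[I] = static_cast<T>(A[I] + B[I]);
    if (A[I] > Threshold)
      Result = A[I];
  }
  return Result;
}
```

Both functions should return the same value and write the same `C` for any input; only the generated code is expected to differ.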

MacDue approved these changes

Member

MacDue left a comment

Generally seems reasonable to me (bar a few nits), but I've not added a benchmark before, so wait and see if there are any more comments.

MicroBenchmarks/LoopVectorization/ConditionalScalarAssignment.cpp Outdated

Comment on lines 14 to 15

		// Pick out-of-range default value.
		T Result = 101;

Member

MacDue Nov 13, 2025

Took a moment to see that "out-of-range" here was referring to the range of the input A values. Could you clarify that?

Author

huntergr-arm Nov 13, 2025

done

MicroBenchmarks/LoopVectorization/ConditionalScalarAssignment.cpp

		@@ -0,0 +1,118 @@
		#include <iostream>

Member

MacDue Nov 13, 2025 (edited)

Was going to comment about the license header, but it seems that's not done here (looking at other files).

Author

huntergr-arm Nov 13, 2025

Yeah, I wondered about that too.

MicroBenchmarks/LoopVectorization/ConditionalScalarAssignment.cpp Outdated

Comment on lines 78 to 82

+                  run_csa_autovec(&A[0], &B[0], &C[0], Threshold);
+                  benchmark::DoNotOptimize(A);
+                  benchmark::DoNotOptimize(B);
+                  benchmark::DoNotOptimize(C);
+                  benchmark::ClobberMemory();

Member

MacDue Nov 13, 2025

Not sure if it makes a difference, but other benchmarks seem to do these first:

Suggested change

-                  run_csa_autovec(&A[0], &B[0], &C[0], Threshold);
-                  benchmark::DoNotOptimize(A);
-                  benchmark::DoNotOptimize(B);
-                  benchmark::DoNotOptimize(C);
-                  benchmark::ClobberMemory();
+                  benchmark::DoNotOptimize(A);
+                  benchmark::DoNotOptimize(B);
+                  benchmark::DoNotOptimize(C);
+                  benchmark::ClobberMemory();
+                  run_csa_autovec(&A[0], &B[0], &C[0], Threshold);

Author

huntergr-arm Nov 13, 2025

MathFunctions.cpp and RuntimeChecks.cpp seem to run the test function first as well, so there doesn't seem to be agreement on this.


          Remove unnecessary headers, improve comments

03848a1

fhahn reviewed

MicroBenchmarks/LoopVectorization/ConditionalScalarAssignment.cpp

+                // for 'A' in init_data below.
+                T Result = 101;
+                for (unsigned i = 0; i < ITERATIONS; i++) {
+                  // Do some work to make the difference noticeable

Contributor

fhahn Nov 14, 2025

Could you add a few more variations, such as the minimal case with just a single CSA, and one with multiple independent CSAs?

Author

huntergr-arm Nov 14, 2025

done.
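As a rough illustration of what such variants could look like, the sketch below shows a loop containing only a single conditional scalar assignment, and a loop with two independent ones. The function names, signatures, and the default value of 101 (assumed to be outside the input range) are hypothetical and are not taken from the PR's final commit.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical sketch of a minimal loop: nothing but one conditional
// scalar assignment, with no extra arithmetic payload.
template <typename T>
T single_csa_only(const T *A, std::size_t N, T Threshold) {
  T Result = 101; // assumed to be outside the range of values in A
  for (std::size_t I = 0; I < N; ++I)
    if (A[I] > Threshold)
      Result = A[I];
  return Result;
}

// Hypothetical sketch with two independent conditional scalar
// assignments in the same loop; neither result depends on the other.
template <typename T>
std::pair<T, T> multi_csa_only(const T *A, const T *B, std::size_t N,
                               T Threshold) {
  T ResultA = 101;
  T ResultB = 101;
  for (std::size_t I = 0; I < N; ++I) {
    if (A[I] > Threshold)
      ResultA = A[I];
    if (B[I] > Threshold)
      ResultB = B[I];
  }
  return {ResultA, ResultB};
}
```

Each result tracks the last element of its own array that exceeded the threshold, so the two assignments can be handled separately by the vectorizer.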

MicroBenchmarks/LoopVectorization/ConditionalScalarAssignment.cpp Outdated

+                }
+              }
+              // Add add auto-vectorized and disabled vectorization benchmarks for math

Contributor

fhahn Nov 14, 2025

The comment needs updating. Currently only ty and Threshold are passed, but it might be helpful to also pass a function if that helps reduce the duplication for additional patterns.

Author

huntergr-arm Nov 14, 2025

done.
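One way to read the suggestion is that the shared setup takes the loop under test as a parameter, so each new pattern only has to supply its kernel. The sketch below shows that idea with a plain function template rather than the benchmark registration code used in the PR; `run_pattern`, the two kernels, and the `i % 101` input data are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical kernel: last element strictly above the threshold.
template <typename T>
T find_last_above(const T *A, std::size_t N, T Threshold) {
  T Result = 101;
  for (std::size_t I = 0; I < N; ++I)
    if (A[I] > Threshold)
      Result = A[I];
  return Result;
}

// Hypothetical kernel: last element strictly below the threshold.
template <typename T>
T find_last_below(const T *A, std::size_t N, T Threshold) {
  T Result = 101;
  for (std::size_t I = 0; I < N; ++I)
    if (A[I] < Threshold)
      Result = A[I];
  return Result;
}

// Shared driver: builds the input once and runs whichever kernel is
// passed in, so the setup is not repeated for every pattern.
template <typename T, typename Kernel>
T run_pattern(Kernel Fn, T Threshold, std::size_t N) {
  std::vector<T> A(N);
  for (std::size_t I = 0; I < N; ++I)
    A[I] = static_cast<T>(I % 101); // assumed input range 0..100
  return Fn(A.data(), N, Threshold);
}
```

A call such as `run_pattern<std::uint8_t>(find_last_above<std::uint8_t>, 50, 256)` then exercises one pattern, and adding another pattern means writing only a new kernel.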


          Add single-csa-only and multi-csa-only variants, tidy up

bc3492f
