@svenweb svenweb commented Nov 27, 2025

Hello,

This PR speeds up Length.cpp by 1.13x to 2.83x by adding SIMD directives that let the compiler emit SIMD vector instructions. The directives are guarded by `#ifdef HAVE_OPEN_SIMD`, so if the preprocessor macro HAVE_OPEN_SIMD is not defined, Length.cpp compiles and runs as before.
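For readers unfamiliar with the pattern, here is a minimal sketch of a guarded SIMD reduction. It assumes the directive in question is OpenMP's `#pragma omp simd` with a `reduction` clause, which the description above does not spell out. `XY` and `lineLength` are hypothetical names for illustration, not the GEOS types or the actual Length.cpp code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2D coordinate; not the GEOS CoordinateXY class.
struct XY {
    double x;
    double y;
};

// Sum of the segment lengths along a polyline.
double lineLength(const std::vector<XY>& pts)
{
    const std::size_t n = pts.size();
    if (n <= 1) {
        return 0.0;
    }
    double len = 0.0;
#ifdef HAVE_OPEN_SIMD
    // Assumed directive: permits per-lane partial sums of len.
    // It only takes effect when OpenMP SIMD support is enabled at
    // compile time (for example -fopenmp-simd on GCC or Clang).
    #pragma omp simd reduction(+:len)
#endif
    for (std::size_t i = 1; i < n; ++i) {
        const double dx = pts[i].x - pts[i - 1].x;
        const double dy = pts[i].y - pts[i - 1].y;
        len += std::sqrt(dx * dx + dy * dy);
    }
    return len;
}
```

When the macro is undefined, the preprocessor drops the pragma and the loop is the plain scalar version, which is the fallback behaviour the description refers to.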

On an x86 Linux machine (Intel i9, 32 GiB DRAM), the gain varied with the size of the line (number of points) and with whether the line was a CoordinateSequence or a CoordinateXY vector.

| Points | Vector Gain | CoordinateSequence Gain |
|---:|---:|---:|
| 10 | -- | -- |
| 100 | -- | -- |
| 1,000 | 1.67x | 1.67x |
| 10,000 | 2.83x | 2.33x |
| 100,000 | 1.83x | 1.96x |
| 1,000,000 | 1.75x | 1.48x |
| 10,000,000 | 1.24x | 1.13x |

The speed and throughput testing script I used:
myLengthTest.cpp

I built the code and ran ctest on both the x86 Linux machine and an M1 Mac; all tests passed.

Thank you,
Sven

Member

pramsey commented Nov 27, 2025

What would cause HAVE_OPEN_SIMD to be set, though? Shouldn't there be an accompanying check in CMake, or is HAVE_OPEN_SIMD something intrinsic to some compilers?

Member

pramsey commented Nov 27, 2025

Reading up on this simd directive, it sounds like in general compilers are already vectorizing pretty automatically. Does your code change (removing the pt0->pt1 assignment), without the simd directive, end up vectorized anyway?
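To make the question concrete, here is a sketch of the two loop shapes it seems to contrast, reading "removing the pt0->pt1 assignment" as replacing a carried previous point with indexed access. This is an interpretation, not the actual diff; `Pt`, `lengthCarried`, and `lengthIndexed` are hypothetical names.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical 2D point type used only for this illustration.
struct Pt {
    double x;
    double y;
};

// Shape A: the previous point is carried from one iteration to the next.
double lengthCarried(const std::vector<Pt>& pts)
{
    if (pts.size() <= 1) {
        return 0.0;
    }
    double len = 0.0;
    Pt p0 = pts[0];
    for (std::size_t i = 1; i < pts.size(); ++i) {
        const Pt p1 = pts[i];
        const double dx = p1.x - p0.x;
        const double dy = p1.y - p0.y;
        len += std::sqrt(dx * dx + dy * dy);
        p0 = p1;  // the carried assignment
    }
    return len;
}

// Shape B: both endpoints are read by index; only len is carried.
double lengthIndexed(const std::vector<Pt>& pts)
{
    if (pts.size() <= 1) {
        return 0.0;
    }
    double len = 0.0;
    for (std::size_t i = 1; i < pts.size(); ++i) {
        const double dx = pts[i].x - pts[i - 1].x;
        const double dy = pts[i].y - pts[i - 1].y;
        len += std::sqrt(dx * dx + dy * dy);
    }
    return len;
}
```

Both shapes add the segment lengths in the same order, so they return identical results. In both, `len` is still accumulated sequentially, so whether either one is vectorized without the directive depends on the compiler and flags and would need to be checked in the generated code.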

@gregbaker

> in general compilers are already vectorizing pretty automatically

This code can't be fully vectorized on its own, because the compiler is obliged to perform the floating-point additions in the order written, so that the rounding error is exactly what the source specifies. Effectively it must compute (((l0+l1)+l2)+l3)+l4. The pragma gives it permission to treat + as commutative and associative, which is what allows the sum to be vectorized automatically.
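A small sketch of the two summation orders may help. The first function is the strict left-to-right order described above; the second keeps four independent partial sums and combines them at the end, which is roughly the order a 4-lane vectorized reduction would use. The lane count of four and both function names are assumptions for illustration.

```cpp
#include <cstddef>
#include <vector>

// Strict left-to-right sum: (((v0 + v1) + v2) + v3) + ...
double sumSequential(const std::vector<double>& v)
{
    double s = 0.0;
    for (double x : v) {
        s += x;
    }
    return s;
}

// Four independent partial sums, combined after the main loop.
double sumFourLanes(const std::vector<double>& v)
{
    double lane[4] = {0.0, 0.0, 0.0, 0.0};
    std::size_t i = 0;
    for (; i + 4 <= v.size(); i += 4) {
        for (std::size_t k = 0; k < 4; ++k) {
            lane[k] += v[i + k];
        }
    }
    double s = (lane[0] + lane[1]) + (lane[2] + lane[3]);
    // Leftover elements that do not fill a whole group of four.
    for (; i < v.size(); ++i) {
        s += v[i];
    }
    return s;
}
```

The two functions agree exactly when every intermediate sum is representable (small integers, for instance), but for general inputs they can differ in the last bits, because floating-point addition is not associative. That difference is why the compiler needs explicit permission before switching from the first order to the second.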
