This repository was archived by the owner on Nov 27, 2022. It is now read-only.

User code vs benchmark monstrosities #9

Open
@leudz

I was playing with frag iteration and I made 3 versions:

  • V1 is a simple loop

time: [407.12 ns 409.32 ns 411.81 ns]

V1
self.0.run(|mut data: ViewMut<Data>| {
    (&mut data).iter().for_each(|data| {
        data.0 *= 2.0;
    })
});
  • V2 helps the compiler auto-vectorize

time: [165.88 ns 166.45 ns 167.05 ns]

V2
self.0.run(|mut data: ViewMut<Data>| {
    (&mut data)
        .iter()
        .into_chunk_exact(4)
        .unwrap_or_else(|_| panic!())
        .for_each(|chunk| {
            chunk[0].0 *= 2.0;
            chunk[1].0 *= 2.0;
            chunk[2].0 *= 2.0;
            chunk[3].0 *= 2.0;
        })
});
  • V3 explicitly uses SIMD

time: [127.37 ns 129.08 ns 131.26 ns]

V3
use core::arch::x86_64::*;

unsafe {
    let delta = _mm_set1_ps(2.0);

    self.0.run(|mut data: ViewMut<Data>| {
        (&mut data)
            .iter()
            .into_chunk_exact(4)
            .unwrap_or_else(|_| panic!())
            .for_each(|chunk| {
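                let simd_chunk = _mm_loadu_ps(chunk as *const _ as *const _);

                // `_mm_mul_ps` returns the product rather than mutating its input, so keep the result
                let simd_chunk = _mm_mul_ps(simd_chunk, delta);

                _mm_storeu_ps(chunk as *mut _ as *mut _, simd_chunk);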
            })
    });
}
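
For comparison outside the ECS, the three variants can be sketched on a plain slice. This is only a rough sketch under assumptions, not the benchmark's code: `Data` here is a hypothetical `#[repr(transparent)]` wrapper around a single `f32`, the function names are made up for illustration, and the standard library's `chunks_exact_mut` stands in for `into_chunk_exact`. It also handles lengths that aren't a multiple of 4 by finishing the remainder with a scalar loop instead of panicking.

```rust
// Hypothetical stand-in for the benchmark's component: one f32 per entity.
// `repr(transparent)` guarantees the same layout as a bare f32.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(transparent)]
struct Data(f32);

// V1-style: simple loop, left entirely to the optimizer.
fn double_simple(data: &mut [Data]) {
    data.iter_mut().for_each(|d| d.0 *= 2.0);
}

// V2-style: fixed-size chunks of 4 give the compiler an easy
// pattern to auto-vectorize.
fn double_chunked(data: &mut [Data]) {
    let mut chunks = data.chunks_exact_mut(4);
    for chunk in &mut chunks {
        chunk[0].0 *= 2.0;
        chunk[1].0 *= 2.0;
        chunk[2].0 *= 2.0;
        chunk[3].0 *= 2.0;
    }
    // Leftover elements (fewer than 4) are handled one by one.
    for d in chunks.into_remainder() {
        d.0 *= 2.0;
    }
}

// V3-style: explicit SSE intrinsics, 4 lanes at a time.
#[cfg(target_arch = "x86_64")]
fn double_simd(data: &mut [Data]) {
    use core::arch::x86_64::{_mm_loadu_ps, _mm_mul_ps, _mm_set1_ps, _mm_storeu_ps};

    let mut chunks = data.chunks_exact_mut(4);
    // SAFETY: SSE is part of the x86_64 baseline, each chunk is exactly
    // 4 contiguous f32-layout values, and the unaligned load/store
    // intrinsics have no alignment requirement.
    unsafe {
        let factor = _mm_set1_ps(2.0);
        for chunk in &mut chunks {
            let ptr = chunk.as_mut_ptr() as *mut f32;
            // The product is a new value; it has to be stored back.
            let product = _mm_mul_ps(_mm_loadu_ps(ptr), factor);
            _mm_storeu_ps(ptr, product);
        }
    }
    for d in chunks.into_remainder() {
        d.0 *= 2.0;
    }
}

// Fallback so the sketch still compiles on other architectures.
#[cfg(not(target_arch = "x86_64"))]
fn double_simd(data: &mut [Data]) {
    double_simple(data);
}
```

All three functions should leave the slice in the same state, which makes it easy to check the hand-optimized versions against the simple one before timing them.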

Few people, if any, will likely write V2 or V3 in their own code, and the time is ridiculously small either way.

My question is: should the benchmarks only use code that users would actually write, be optimized as much as possible, or land somewhere in between?
