Commit bed8374: Add a few more tests & logical operators; make a README.md more useful for users

Parent: 0f2d2a4

5 files changed: +331 −39 lines changed

.github/workflows/linux-tests.yml (6 additions & 6 deletions)

@@ -1,14 +1,14 @@
 name: Linux Tests
 
-
 on:
-  push:
-    branches: [ main, master ]
-  pull_request:
-    branches: [ main, master ]
+  push:
+    branches: [ main, master ]
+    paths:
+      - 'SIMD.h'
+      - 'tests/simd_tests.cpp'
 
 permissions:
-  contents: write
+  contents: write
 
 jobs:
   build-and-test:

.github/workflows/windows-tests.yml (4 additions & 2 deletions)

@@ -3,8 +3,10 @@ name: Windows Tests
 on:
   push:
     branches: [ main, master ]
-  pull_request:
-    branches: [ main, master ]
+    paths:
+      - 'SIMD.h'
+      - 'tests/simd_tests.cpp'
+
 permissions:
   contents: write
 

README.md (108 additions & 16 deletions)

@@ -1,18 +1,19 @@
 # SIMD Operations Framework
 
-A high-performance C++ framework for SIMD (Single Instruction Multiple Data) operations, providing optimized vector math operations for both floating-point and integer data types.
+A high-performance C++ framework for SIMD (Single Instruction Multiple Data) operations, providing optimized vector math operations for both floating-point and integer data types. This library significantly accelerates data-parallel computations by leveraging CPU SIMD instruction sets.
 
 ## Features
 
 - Optimized SIMD operations for different data types:
-  - `Int128`, `Int256` and `Int512` for integer operations (with `int8_t`, `int16_t`, and `int32_t`)
+  - `Int128`, `Int256`, and `Int512` for integer operations (with `int8_t`, `int16_t`, and `int32_t`)
   - `Float256` and `Float512` for floating-point operations
   - `Double256` and `Double512` for double-precision operations
 - Standard mathematical operations:
   - Addition
   - Subtraction
   - Multiplication
   - Division (for floating-point and double-precision)
+  - Equality comparison
 - Automatic vectorization with significant performance improvements
 - Comprehensive test suite using Google Test
 - Performance benchmarks using Google Benchmark
@@ -39,26 +40,107 @@ Performance improvements comparing SIMD operations vs. standard operations on di
 
 ### Prerequisites
 
-- C++ compiler with SIMD support (AVX2 recommended)
+- C++ compiler with SIMD support (tests verified on GCC and MSVC)
+- CPU with support for relevant instruction sets (SSE2, AVX2, AVX512)
 
-### Using the Project
+### User Guide
 
-This is a header only project, no need for building.
-```bash
-#include<SIMD.h>
+This is a header-only library, no building required:
+
+```c++
+#include <SIMD.h>
+```
+
+The available SIMD types will be detected by your IDE via IntelliSense. A basic runtime check is also implemented in SIMD types.
+
+The library uses the namespace 'SIMD', but you can customize this before including the header:
+
+```c++
+#define BASIC_SIMD_NAMESPACE MySimdNamespace
+#include <SIMD.h>
 ```
-Should be sufficient.
 
-### Running Tests & Benchmarks
+### Available Types
+
+The following SIMD types are available on compatible CPU and compiler configurations:
+
+| SIMD Type | Container Type | Width | Type of Each Element | Description | ISA Extension |
+|:---------:|:--------------:|:-----:|:--------------------:|:-----------:|:-------------:|
+| `SIMD::int_128<int8_t>` | Integer | 128 Bit | `int8_t` | Stores 16 elements, each `int8_t` | SSE2 |
+| `SIMD::int_128<int16_t>` | Integer | 128 Bit | `int16_t` | Stores 8 elements, each `int16_t` | SSE2 |
+| `SIMD::int_128<int32_t>` | Integer | 128 Bit | `int32_t` | Stores 4 elements, each `int32_t` | SSE2 |
+| `SIMD::int_128<int64_t>` | Integer | 128 Bit | `int64_t` | Stores 2 elements, each `int64_t` | SSE2 |
+| `SIMD::int_256<int8_t>` | Integer | 256 Bit | `int8_t` | Stores 32 elements, each `int8_t` | AVX2 |
+| `SIMD::int_256<int16_t>` | Integer | 256 Bit | `int16_t` | Stores 16 elements, each `int16_t` | AVX2 |
+| `SIMD::int_256<int32_t>` | Integer | 256 Bit | `int32_t` | Stores 8 elements, each `int32_t` | AVX2 |
+| `SIMD::int_256<int64_t>` | Integer | 256 Bit | `int64_t` | Stores 4 elements, each `int64_t` | AVX2 |
+| `SIMD::int_512<int8_t>` | Integer | 512 Bit | `int8_t` | Stores 64 elements, each `int8_t` | AVX512F or AVX512BW |
+| `SIMD::int_512<int16_t>` | Integer | 512 Bit | `int16_t` | Stores 32 elements, each `int16_t` | AVX512F or AVX512BW |
+| `SIMD::int_512<int32_t>` | Integer | 512 Bit | `int32_t` | Stores 16 elements, each `int32_t` | AVX512F |
+| `SIMD::int_512<int64_t>` | Integer | 512 Bit | `int64_t` | Stores 8 elements, each `int64_t` | AVX512F |
+| `SIMD::float_256` | Float | 256 Bit | `float` | Stores 8 elements, each `float` | AVX512F |
+| `SIMD::float_512` | Float | 512 Bit | `float` | Stores 16 elements, each `float` | AVX512F |
+| `SIMD::double_256` | Float | 256 Bit | `double` | Stores 4 elements, each `double` | AVX512F |
+| `SIMD::double_512` | Float | 512 Bit | `double` | Stores 8 elements, each `double` | AVX512F |
+
+## Usage Examples
+
+Each SIMD type provides several operation modes to ensure optimal performance for different use cases:
+
+### Basic Operations
+
+```c++
+// Initialize with values
+SIMD::int_128<int8_t> a(1, 2, 3, 4, 5, 6); // Initialize with values
+SIMD::int_128<int8_t> b; // Initialized to zero by default
+
+// Standard operator syntax
+SIMD::int_128<int8_t> result = a + b;
+
+// Or using explicit methods
+SIMD::int_128<int8_t> result2 = SIMD::int_128<int8_t>::Add(a, b);
+```
+
+### In-place Operations
+
+```c++
+SIMD::int_128<int8_t> a(1, 2, 3, 4, 5, 6);
+SIMD::int_128<int8_t> b(10, 20, 30, 40, 50, 60);
+
+// Using operator
+a += b;
+
+// Or explicit method
+SIMD::int_128<int8_t>::AddInplace(a, b);
+```
 
-Use the provided batch script:
+### Raw Memory Operations
 
-#### Windows
+For cases where you're working with existing aligned memory:
+
+```c++
+// Ensure proper alignment
+alignas(SIMD::int_128<int8_t>::Alignment) int8_t a[16];
+alignas(SIMD::int_128<int8_t>::Alignment) int8_t b[16];
+
+// Fill arrays with data...
+
+// Perform SIMD operation directly on memory
+SIMD::int_128<int8_t>::AddInplaceRaw(a, b); // a += b
+```
+
+For large arrays, it's recommended to use an aligned dynamic memory allocator. The `AlignedMemory` namespace included with SIMD.h provides this functionality.
+
+## Running Tests & Benchmarks
+
+Use the provided scripts to run tests and benchmarks:
+
+### Windows
 ```bash
 run_tests.bat
 ```
 
-#### Linux
+### Linux
 ```bash
 run_tests.sh
 ```
@@ -69,10 +151,20 @@ This project is licensed under the GPLv3 License - see the LICENSE file for deta
 
 ## Contributing
 
-Current state of the project is purely based on the personal needs, any contribution for extending SIMD support is appreciated & welcome.
+Contributions to extend SIMD support are welcome! The project started based on personal needs but could benefit from community involvement.
+
+### How to Contribute
+
+1. Fork the repository
+2. Create a feature branch
+3. Implement your changes
+4. Add or update tests as appropriate
+5. Submit a pull request
+
+The SIMD.h implementation uses macros to minimize repetitive code. The basic pattern is to define a macro for a new operator/function as a specialization of the base SIMD_Type_t, then apply it for different bit widths.
 
-You can always contact me via e-mail from: [email protected]
+The SIMD::Array class provided in the codebase serves as a good example for building custom SIMD classes efficiently.
 
-The SIMD.h uses macros a lot to eliminate manual work of repetitive coding, but the idea is really simple; define a macro for a new operator/function as a specialization of base SIMD_Type_t and then use it for different bit widths.
+### Contact
 
-The SIMD::Array class is created for testing, but I believe it is a good example for building your own SIMD classes efficiently.
+For questions or suggestions, please contact: [email protected]
