Experimental `std::simd` #6732

fbusato · 2025-11-22T01:47:18Z

Motivations

Modern GPU architectures increasingly expose fine-grained, single-thread SIMD capabilities to maximize throughput within individual CUDA threads. While the GPU programming model focuses strongly on SIMT, newer hardware relies on specialized SIMD operations to saturate its execution units. Some examples include:

int16_t SIMD instructions (DPX).
FADDx2, FMULx2, FMAx2.
Bfloat16x2 and Halfx2 intrinsics.
3-Input floating-point Minimum / Maximum.
Integer Add X3 (IADD3).
Integer dot product __dp4a.
SIMD Video Instructions, e.g. vabsdiff4.
SIMD within a register (SWAR) for integer types
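
To make the last item concrete, here is a minimal sketch of SWAR for integer types: four packed `uint8_t` lanes are added inside one 32-bit register without letting carries cross lane boundaries. The function names `add_u8x4` and `run_swar_examples` are hypothetical and used only for illustration; this is plain C++ and is not taken from the PR or from any CCCL header.

```cpp
#include <cassert>
#include <cstdint>

// Lane-wise addition of four uint8_t values packed into one 32-bit word.
// Each lane wraps modulo 256 independently; no carry leaks into its neighbor.
// NOTE: hypothetical helper for illustration, not part of the PR.
constexpr std::uint32_t add_u8x4(std::uint32_t a, std::uint32_t b)
{
  constexpr std::uint32_t low7 = 0x7F7F7F7Fu; // low 7 bits of every lane
  constexpr std::uint32_t msb  = 0x80808080u; // top bit of every lane
  // Adding only the low 7 bits guarantees that a carry stops at bit 7 of its own lane.
  const std::uint32_t partial = (a & low7) + (b & low7);
  // The top bit of each lane is a ^ b ^ carry_in; the carry_in is already in `partial`.
  return partial ^ ((a ^ b) & msb);
}

// Small runtime check of the lane-wise behavior.
inline void run_swar_examples()
{
  // Lanes (high to low): 0x01+0x01, 0xFF+0x01 (wraps to 0x00), 0x7F+0x01, 0x80+0x01
  assert(add_u8x4(0x01FF7F80u, 0x01010101u) == 0x02008081u);
  assert(add_u8x4(0u, 0xFFFFFFFFu) == 0xFFFFFFFFu);
}
```

A plain 32-bit addition of the same operands would let the overflow of the `0xFF + 0x01` lane spill into the lane above it, which is exactly what the masking avoids.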

C++26 std::simd provides a standardized abstraction for writing vectorized code. This is a great opportunity to unify the custom code that currently handles each variant and to reduce CUDA software fragmentation. By adopting a std::simd-like API, developers can write a single vectorized kernel that compiles to the optimal instructions for any GPU architecture.
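
As a rough illustration of the idea, the sketch below shows a fixed-width value type with element-wise operators, so that user code is written once against lanes rather than against architecture-specific intrinsics. Everything here (`simd_sketch`, `multiply_add`) is a hypothetical stand-in written for this example; it is neither the C++26 `std::simd` interface nor the API added by this PR. A real implementation could, in principle, specialize such operators to map onto paired instructions where the hardware offers them.

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-width vector type: N lanes of T with element-wise operators.
// NOTE: illustration only; not the std::simd or CCCL interface.
template <typename T, std::size_t N>
struct simd_sketch
{
  std::array<T, N> lanes{};

  static constexpr std::size_t size() { return N; }

  T operator[](std::size_t i) const { return lanes[i]; }

  // Element-wise addition: lane i of the result is a[i] + b[i].
  friend simd_sketch operator+(simd_sketch a, const simd_sketch& b)
  {
    for (std::size_t i = 0; i < N; ++i)
    {
      a.lanes[i] = static_cast<T>(a.lanes[i] + b.lanes[i]);
    }
    return a;
  }

  // Element-wise multiplication: lane i of the result is a[i] * b[i].
  friend simd_sketch operator*(simd_sketch a, const simd_sketch& b)
  {
    for (std::size_t i = 0; i < N; ++i)
    {
      a.lanes[i] = static_cast<T>(a.lanes[i] * b.lanes[i]);
    }
    return a;
  }
};

// User code written once against the abstraction: computes a * b + c per lane.
template <typename T, std::size_t N>
simd_sketch<T, N> multiply_add(const simd_sketch<T, N>& a, const simd_sketch<T, N>& b, const simd_sketch<T, N>& c)
{
  return a * b + c;
}
```

The caller never names a specific instruction; which instructions are emitted is left to the type's implementation and the compiler.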

PR Goals and Non-Goals

The PR aims to provide a basic implementation of std::simd and to lay the foundation for future optimizations and extensions.

Advanced math and bit operations, e.g. std::abs, std::pow, std::popcount, etc., as well as std::complex bindings, are outside the scope of this first PR.

Non-Goals:

Fully implement std::simd.
Implement custom ABIs to target host vector instructions.

Implementation Notes

The implementation is based on the LLVM code experimental/__simd and extended to support the related C++ proposals:

Some optimizations are already exploited in the CCCL code, for example in thread_simd.h and thread_reduce.h. They will gradually be added to the implementation.

Partially addresses #30

miscco

This is not using SIMD on the host, is there any reason for that?

fbusato · 2025-11-24T17:05:03Z

First, because this is the first PR. Second, because we care more about GPU than CPU. Third, because the feature is also still experimental in other standard libraries.

github-actions · 2025-11-24T23:52:02Z

🥳 CI Workflow Results

🟩 Finished in 15m 27s: Pass: 100%/42 | Total: 2h 47m | Max: 14m 49s | Hits: 99%/20431

See results here.

fbusato added 18 commits November 20, 2025 11:37

more operations

0b4ccfe

fixes

9e46a30

other fixes

dd6599a

add simd_mask

a917c30

remove explicit

443837a

follow the standard

f702a01

reduce redundancy

d1d7bec

headers and explicit usage

21bb217

add simd_mask generator

c2a9d1c

fix initialization

523a7c4

formatting

8bf8783

reduce redundancy

e5e67bb

working unit test

4ebcb5d

fix semantic

489d3f3

add simd::scalar

b66bf7e

use bool as mask_storage

fd23872

simplify __simd_reference

3b3a50e

header cleanup

5458cf7

fbusato self-assigned this Nov 22, 2025

fbusato requested a review from a team as a code owner November 22, 2025 01:47

fbusato added the 3.2.0 Targeted for 3.2.0 release label Nov 22, 2025

fbusato added this to CCCL Nov 22, 2025

fbusato requested a review from ericniebler November 22, 2025 01:47

github-project-automation bot moved this to Todo in CCCL Nov 22, 2025

cccl-authenticator-app bot moved this from Todo to In Review in CCCL Nov 22, 2025


miscco reviewed Nov 24, 2025


fbusato added 2 commits November 24, 2025 15:00

fix c++17

f672929

fix MSVC warning

2b5d84e

fbusato added 3 commits November 24, 2025 15:06

formatting

e7d976c

fix macro names

4c2d8b8

fix count() signature

9e4ba7c
