Description
RapidsMPF needs two kinds of reduction operations:
- Fixed-size reduction
- Dynamic-size reduction over packed data
These two cases have different constraints and therefore belong at different layers of the stack.
Fixed-size reduction
Fixed-size reductions operate on buffers with a known size and layout (scalars, POD structs, fixed arrays).
Both MPI and UCX+UCC provide native reduce / allreduce for this case.
Because of this, fixed-size reduce / all_reduce should be part of the Communicator API:
- MPI backend → `MPI_Allreduce` / `MPI_Reduce`
- UCX/UCC backend → UCC `ALLREDUCE` collectives
This gives us efficient, backend-optimized collectives for simple fixed-size data.
However, UCXX does not currently expose UCC support, so enabling UCC-based collectives may require significant work (how much, @pentschev?). As an intermediate step, the UCXX communicator in RapidsMPF could implement reduce on top of the existing send/recv primitives.
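The send/recv-based intermediate step could look like the following sketch. This is illustrative only: `Comm`, `send`, and `recv` are stand-ins for the real Communicator primitives, not the RapidsMPF API, and the "network" is simulated with in-memory queues and threads.

```python
# Hedged sketch: allreduce built only on point-to-point send/recv,
# as the suggested intermediate step for the UCXX communicator.
# `Comm` and its methods are hypothetical names, not RapidsMPF API.
import operator
import threading
from queue import Queue


class Comm:
    """Toy communicator: one inbox queue per rank simulates the network."""

    def __init__(self, rank, inboxes):
        self.rank = rank
        self.nranks = len(inboxes)
        self._inboxes = inboxes

    def send(self, dest, value):
        self._inboxes[dest].put(value)

    def recv(self):
        # Arrival order is arbitrary, which is fine as long as the
        # reduction operator is commutative and associative.
        return self._inboxes[self.rank].get()


def allreduce(comm, value, op=operator.add):
    """Reduce-to-root (rank 0) then broadcast, using only send/recv."""
    if comm.rank == 0:
        acc = value
        for _ in range(comm.nranks - 1):
            acc = op(acc, comm.recv())
        for dest in range(1, comm.nranks):
            comm.send(dest, acc)
        return acc
    comm.send(0, value)
    return comm.recv()


# Demo: four "ranks" on threads; each contributes rank + 1.
inboxes = [Queue() for _ in range(4)]
results = {}


def worker(rank):
    results[rank] = allreduce(Comm(rank, inboxes), rank + 1)


threads = [threading.Thread(target=worker, args=(r,)) for r in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The reduce-to-root-then-broadcast pattern is the simplest correct scheme; a real implementation would likely prefer a tree or recursive-doubling layout to avoid serializing on rank 0.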
Dynamic-size reduction (packed data)
Packed data varies in both size and structure, and no backend (MPI or UCX/UCC) supports all-reduce for variable-length payloads.
Supporting this case requires a custom protocol implemented above the communicator:
- exchange sizes
- reserve/allocate buffers (with spilling if needed)
- apply a user-provided reduction operator
For this reason, dynamic-size reductions should not be part of the low-level Communicator, but instead implemented in higher-level RapidsMPF logic.
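The three protocol steps above can be sketched in a single-process stand-in. Everything here is hypothetical: `packed_allreduce` and `concat_tables` are illustrative names, the packed payloads are stand-in JSON bytes rather than serialized table buffers, and the size exchange / buffer reservation are simulated in memory.

```python
# Hedged sketch of the protocol layered above the communicator:
# (1) exchange sizes, (2) reserve a receive buffer (real code would
# spill if it does not fit in device memory), (3) apply a
# user-provided reduction operator over variable-length packed data.
import json


def concat_tables(a: bytes, b: bytes) -> bytes:
    """User-provided operator: merge two packed payloads.

    Stand-in: JSON lists play the role of serialized table buffers.
    """
    return json.dumps(json.loads(a) + json.loads(b)).encode()


def packed_allreduce(payloads, op):
    """Single-process stand-in for all-reduce over packed data.

    payloads: each rank's packed bytes (the simulated "network").
    """
    sizes = [len(p) for p in payloads]            # step 1: size exchange
    recv_buf = bytearray(max(sizes))              # step 2: reserve buffer
    acc = payloads[0]
    for p in payloads[1:]:
        recv_buf[: len(p)] = p                    # simulated receive
        acc = op(acc, bytes(recv_buf[: len(p)]))  # step 3: user operator
    return acc


# Demo: three "ranks", each contributing a one-element packed list.
payloads = [json.dumps([r]).encode() for r in range(3)]
result = packed_allreduce(payloads, concat_tables)
```

Because the output size is not known in advance, each step of the real protocol would need a fresh size exchange before the next receive, which is exactly why this belongs above the communicator rather than inside it.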
Questions
- Do we agree that fixed-size reduction is the highest priority?
- Do we need user-defined reduction operators for fixed-size reduction, or is it sufficient to provide basic built-in operators (ADD, MUL, MIN, MAX) on fundamental datatypes such as `int` and `float`?
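If only built-in operators are provided, the surface could be as small as the following sketch. The names (`REDUCE_OPS`, `reduce_fixed`) are illustrative, not a proposed API.

```python
# Hedged sketch: built-in operators only, restricted to fundamental
# datatypes, as one possible answer to the question above.
import operator

REDUCE_OPS = {
    "ADD": operator.add,
    "MUL": operator.mul,
    "MIN": min,
    "MAX": max,
}


def reduce_fixed(values, op_name):
    """Apply a built-in operator to fixed-size scalar values."""
    if not all(isinstance(v, (int, float)) for v in values):
        raise TypeError("built-in ops are limited to int/float")
    op = REDUCE_OPS[op_name]
    acc = values[0]
    for v in values[1:]:
        acc = op(acc, v)
    return acc
```

Note that MPI takes the same position by default: `MPI_Allreduce` ships predefined operators (`MPI_SUM`, `MPI_MIN`, ...) and only adds user-defined ones through the separate `MPI_Op_create` mechanism.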