NCCL v2.28.7 Released! #1879
gab9talavera announced in Announcements
See the NCCL 2.28.7 Release Notes and documentation for more information. Examples will be updated in the next few days.
GPU-Initiated Networking (GIN)
Brings GPU-initiated networking capability into application kernels. Linux kernel >= 6.1 is required. For some current examples, please see nccl-tests with GIN.
New ncclCommRevoke API for fault tolerance
Revokes a communicator without freeing its resources, canceling outstanding collectives and bringing the communicator to a safe state before split/shrink/finalize/destroy. Supports blocking and non-blocking usage.
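As a rough illustration of the flow described above, here is a hedged C sketch of a recovery path. The `ncclCommRevoke` prototype is assumed to be `ncclResult_t ncclCommRevoke(ncclComm_t comm)`; the helper name `recoverComm` is hypothetical. Check `nccl.h` in your 2.28.7 install for the exact signature.

```c
/* Hedged sketch of a fault-tolerance flow around ncclCommRevoke.
 * Assumption: ncclResult_t ncclCommRevoke(ncclComm_t comm); verify
 * against nccl.h for this release. */
#include <stdio.h>
#include <nccl.h>

/* Hypothetical helper, invoked after a rank detects a failure,
 * e.g. via ncclCommGetAsyncError(). */
static ncclResult_t recoverComm(ncclComm_t comm) {
  /* Revoke outstanding collectives and bring the communicator to a
   * safe state without freeing its resources. With a non-blocking
   * communicator this may return ncclInProgress. */
  ncclResult_t res = ncclCommRevoke(comm);
  if (res != ncclSuccess && res != ncclInProgress) {
    fprintf(stderr, "revoke failed: %s\n", ncclGetErrorString(res));
    return res;
  }
  /* Once revoked, split/shrink/finalize/destroy are safe to call.
   * Here we simply destroy the communicator. */
  return ncclCommDestroy(comm);
}
```

A real application would typically shrink or split the revoked communicator to continue with the surviving ranks rather than destroy it outright.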
New NCCL Environment Plugin
Allows a plugin to set NCCL environment variables, for example after loading them from a centralized database. See the documentation for details on the environment plugin.
New NCCL Examples on GitHub
A new repository of practical code samples that highlight NCCL's core features. It covers point-to-point communication and collective operations, as well as advanced features such as user buffer registration, symmetric memory, and the device API.
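To give a flavor of the kind of samples involved, here is a minimal single-process all-reduce across all local GPUs, using the long-standing `ncclCommInitAll`/`ncclAllReduce` API (the actual samples in the repository may be structured differently).

```c
/* Minimal single-process, all-local-GPUs all-reduce sketch.
 * Uses one communicator and one CUDA stream per device. */
#include <stdlib.h>
#include <cuda_runtime.h>
#include <nccl.h>

#define COUNT 1024  /* elements per GPU buffer */

int main(void) {
  int nDev = 0;
  cudaGetDeviceCount(&nDev);

  ncclComm_t *comms = malloc(nDev * sizeof(ncclComm_t));
  cudaStream_t *streams = malloc(nDev * sizeof(cudaStream_t));
  float **buf = malloc(nDev * sizeof(float*));
  int *devs = malloc(nDev * sizeof(int));

  for (int i = 0; i < nDev; i++) {
    devs[i] = i;
    cudaSetDevice(i);
    cudaMalloc((void**)&buf[i], COUNT * sizeof(float));
    cudaStreamCreate(&streams[i]);
  }

  /* One communicator per local GPU, all in this process. */
  ncclCommInitAll(comms, nDev, devs);

  /* Group the per-GPU calls so NCCL launches them as one operation. */
  ncclGroupStart();
  for (int i = 0; i < nDev; i++)
    ncclAllReduce(buf[i], buf[i], COUNT, ncclFloat, ncclSum,
                  comms[i], streams[i]);
  ncclGroupEnd();

  for (int i = 0; i < nDev; i++) {
    cudaSetDevice(i);
    cudaStreamSynchronize(streams[i]);
    ncclCommDestroy(comms[i]);
    cudaFree(buf[i]);
  }
  free(comms); free(streams); free(buf); free(devs);
  return 0;
}
```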
Device API improvements
Extends device API functionality, including the ability to query the availability of symmetric memory.
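For context on the symmetric-memory feature mentioned above, here is a hedged sketch of registering a buffer as a symmetric window using the window API introduced in NCCL 2.27. The helper name `allocSymmetric` is hypothetical, and the exact constants and signatures (`ncclCommWindowRegister`, `NCCL_WIN_COLL_SYMMETRIC`) should be verified against `nccl.h` for this release.

```c
/* Hedged sketch: symmetric-memory registration via the NCCL window API.
 * Assumption: ncclCommWindowRegister and NCCL_WIN_COLL_SYMMETRIC as
 * introduced in NCCL 2.27; verify against nccl.h. */
#include <nccl.h>

/* Hypothetical helper: allocate and register a symmetric buffer. */
ncclResult_t allocSymmetric(ncclComm_t comm, size_t bytes,
                            void **buf, ncclWindow_t *win) {
  /* Allocate memory that NCCL can register efficiently. */
  ncclResult_t res = ncclMemAlloc(buf, bytes);
  if (res != ncclSuccess) return res;
  /* Every rank registers a same-sized buffer as a symmetric window,
   * which allows NCCL to use its faster symmetric kernels. */
  return ncclCommWindowRegister(comm, *buf, bytes, win,
                                NCCL_WIN_COLL_SYMMETRIC);
}
```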
Enhanced NCCL RAS output
The output now better supports metrics collection and consumption by automated analysis systems.
GitHub pull requests resolved
- NCCL Fast Init: CPU optimizations for NCCL initialization at large scale (PR #1789).
- NCCL Fast Init: improved bootstrap AllGather performance by 2x at large scale by sending bootstrap information bidirectionally (PR #1791).
- Fixed an issue present since NCCL 2.28.3 where an error code was not drained but instead propagated into the next CUDA kernel invocation (PR #1864: Drain error code when kernel is not found).
Other notable improvements
- Avoided buffer registration attempts when incompatible memory allocators are in use; such registrations could lead to crashes.
- Fixed an issue affecting small messages with size granularity under 8 bytes, or when multiple symmetric operations were aggregated in a group.
- Fixed an issue with an uneven GPU count per node.
- Fixed handling of split mask settings where NCCL cannot find a viable ring.
- Fixed an issue when running on hosts with certain specific older CUDA drivers.