@Autumn1998
Collaborator

  1. Add topology detection for RDMA
  2. Optimize performance for small hidden sizes and large EP sizes
  3. Use ~/.deepep/hybrid_ep/jit as the JIT path
  4. Merge two syncs into one and move the sync point before the dispatch
  5. Support dynamic sequence length for RDMA
  6. Fix some bugs

@dongwang4096

Good job! It seems the brackets are misplaced in topo_detection.cuh, lines 330-332.

@Autumn1998
Collaborator Author

Fixed. This was also a bug with the old NCCL version.
