Commit 3c9fd69

use block_current_stream work api
Summary: use block_current_stream instead of wait on the allreduce work handle, to avoid blocking the CPU when using Gloo.

Test Plan:

## Before
<img width="1285" height="667" alt="image" src="https://github.com/user-attachments/assets/082b55ce-efac-46f8-adba-df92ffec864f" />

## After
<img width="1283" height="662" alt="image" src="https://github.com/user-attachments/assets/923bcc7b-740a-43d8-baed-19e3d77a7889" />
1 parent be3e833 commit 3c9fd69

3 files changed: 3 additions, 6 deletions

.github/workflows/lint.yaml

Lines changed: 1 addition & 0 deletions
@@ -27,6 +27,7 @@ jobs:
           lintrunner init

           pip install .[dev] -v
+          pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu128
       - name: Run lintrunner
         run: |
           set -eux

.github/workflows/unittest.yaml

Lines changed: 1 addition & 5 deletions
@@ -15,11 +15,7 @@ jobs:
           - runs-on: "linux.2xlarge"
             gpu-arch-type: "cpu"
             gpu-arch-version: ""
-            torch-version: "stable"
-          - runs-on: "linux.g5.12xlarge.nvidia.gpu"
-            gpu-arch-type: "cuda"
-            gpu-arch-version: "12.4"
-            torch-version: "stable"
+            torch-version: "nightly"
           - runs-on: "linux.g5.12xlarge.nvidia.gpu"
             gpu-arch-type: "cuda"
             gpu-arch-version: "12.4"

torchft/manager.py

Lines changed: 1 addition & 1 deletion
@@ -373,7 +373,7 @@ def allreduce(
                 )
             else:
                 work = self._pg.allreduce([tensor], ReduceOp.SUM)
-                work.wait()
+                work.block_current_stream()
                 fut = work.get_future()

             stream: Optional[torch.cuda.Stream] = (
