Summary:
Pull Request resolved: #540
What's going on here:
1. `RemoteProcessAlloc` is instantiated in the client code (e.g. https://fburl.com/code/p4t5aewo)
2. `RemoteProcessAlloc::new()` now spawns a signal handler and holds onto the `JoinHandle`. A tx-rx pair is created so that the signal handler task is aware of the addresses of hosts in `RemoteProcessAlloc::host_states` as they are added and removed
3. `RemoteProcessAlloc::host_states` is now wrapped in a struct, `HostStates`, which holds the tx side and aims to expose the same interface as a `HashMap` while sending every update to the map over the tx. When a `RemoteProcessAllocHostState` is inserted, its address and `HostId` are sent over the tx. When a `RemoteProcessAllocHostState` is removed, only the `HostId` is sent over the tx (the address is `None`).
4. When the handler receives a `HostId` and `Some(ChannelAddr)`, it will dial this address and insert the resulting `ChannelTx` into its own `HashMap` with the `HostId` as the key
5. When the handler receives a `HostId` and `None`, it will remove the corresponding entry from its `HashMap`
6. When the handler receives a signal, it will iterate over all `ChannelTx`s in the `HashMap` and send `RemoteProcessAllocatorMessage::Signal(signal)` over each `ChannelTx` to the `RemoteProcessAllocator` running on a remote machine
7. The `RemoteProcessAllocator` receives the message. If the signal is `SIGINT`, it calls `ensure_previous_alloc_stopped` to stop gracefully, then re-raises the signal
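
The bookkeeping in steps 2 through 6 can be sketched roughly as follows. This is a minimal illustration, not the code from this change: it uses `std::sync::mpsc` instead of the project's async channels, `HostId` and `ChannelAddr` are plain `String` stand-ins, the stored address stands in for a dialed `ChannelTx`, and `HostUpdate`, `apply_updates` and `broadcast_signal` are hypothetical names introduced only for this example.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

// Hypothetical stand-ins for the real types named in the summary.
type HostId = String;
type ChannelAddr = String;

/// Update sent to the signal handler: `Some(addr)` on insert, `None` on removal.
type HostUpdate = (HostId, Option<ChannelAddr>);

/// Map-like wrapper that mirrors inserts and removals over the tx side.
struct HostStates<V> {
    inner: HashMap<HostId, V>,
    tx: Sender<HostUpdate>,
}

impl<V> HostStates<V> {
    fn new(tx: Sender<HostUpdate>) -> Self {
        Self { inner: HashMap::new(), tx }
    }

    /// Insert a host state and tell the handler which address to dial.
    fn insert(&mut self, id: HostId, addr: ChannelAddr, state: V) -> Option<V> {
        // Send errors are ignored: the handler may already have shut down.
        let _ = self.tx.send((id.clone(), Some(addr)));
        self.inner.insert(id, state)
    }

    /// Remove a host state and tell the handler to drop its entry.
    fn remove(&mut self, id: &str) -> Option<V> {
        let removed = self.inner.remove(id);
        if removed.is_some() {
            let _ = self.tx.send((id.to_string(), None));
        }
        removed
    }
}

/// Handler side: apply all queued updates to the handler's own map.
/// The real handler would dial `addr` and store a `ChannelTx` instead.
fn apply_updates(rx: &Receiver<HostUpdate>, dialed: &mut HashMap<HostId, ChannelAddr>) {
    for (id, addr) in rx.try_iter() {
        match addr {
            Some(addr) => {
                dialed.insert(id, addr);
            }
            None => {
                dialed.remove(&id);
            }
        }
    }
}

/// Handler side: on a signal, produce one (address, signal) message per known host.
/// The result is sorted only to make the output deterministic for this sketch.
fn broadcast_signal(dialed: &HashMap<HostId, ChannelAddr>, signal: i32) -> Vec<(ChannelAddr, i32)> {
    let mut out: Vec<_> = dialed.values().map(|a| (a.clone(), signal)).collect();
    out.sort();
    out
}
```

In this sketch a caller would create the pair with `channel()`, hand the sender to `HostStates::new`, and keep the receiver in the handler task, which alternates between `apply_updates` and forwarding signals. Sending only the `HostId` and address, rather than sharing the map itself, keeps the handler free of any lock on `host_states`.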
Reviewed By: moonli
Differential Revision: D78097380
hyperactor_mesh/Cargo.toml
Lines changed: 11 additions & 1 deletion
@@ -1,4 +1,4 @@
-# @generated by autocargo from //monarch/hyperactor_mesh:[hyperactor_mesh,hyperactor_mesh_test_bootstrap,process_allocator_cleanup,process_allocator_test_bin,process_allocator_test_bootstrap]
+# @generated by autocargo from //monarch/hyperactor_mesh:[hyperactor_mesh,hyperactor_mesh_test_bootstrap,hyperactor_mesh_test_remote_process_alloc,hyperactor_mesh_test_remote_process_allocator,process_allocator_cleanup,process_allocator_test_bin,process_allocator_test_bootstrap]
 
 [package]
 name = "hyperactor_mesh"
@@ -11,6 +11,14 @@ license = "BSD-3-Clause"
 name = "hyperactor_mesh_test_bootstrap"
 path = "test/bootstrap.rs"
 
+[[bin]]
+name = "hyperactor_mesh_test_remote_process_alloc"
+path = "test/remote_process_alloc.rs"
+
+[[bin]]
+name = "hyperactor_mesh_test_remote_process_allocator"