Commit 75517c8

[2.6] Update WandB code and example (#3429)
Use job api for WandB example. Fix WandBReceiver bug.

### Description

Use job api for WandB example. Fix WandBReceiver bug.

### Types of changes

<!--- Put an `x` in all the boxes that apply, and remove the not applicable items -->

- [x] Non-breaking change (fix or new feature that would not break existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Quick tests passed locally by running `./runtest.sh`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated.

---------

Co-authored-by: Chester Chen <[email protected]>
1 parent 90465f0 commit 75517c8

11 files changed: +439 -375 lines changed

examples/advanced/experiment-tracking/wandb/README.md

Lines changed: 30 additions & 26 deletions
@@ -8,19 +8,14 @@ This example also highlights the Weights and Biases streaming capability from th
 
 > **_NOTE:_** This example uses the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset and will load its data within the trainer code.
 
-### 1. Install requirements and configure PYTHONPATH
+### 1. Install requirements
 
 Install additional requirements (if you already have a specific version of nvflare installed in your environment, you may want to remove nvflare in the requirements to avoid reinstalling nvflare):
 
 ```
 python -m pip install -r requirements.txt
 ```
 
-Set `PYTHONPATH` to include custom files of this example:
-```
-export PYTHONPATH=${PWD}/..
-```
-
 ### 2. Make sure the FL server is logged into Weights and Biases
 
 Import the W&B Python SDK and log in:
@@ -35,42 +30,51 @@ Provide your API key when prompted.
 
 ### 3. Run the experiment
 
-Use nvflare simulator to run the example:
+Use job api to run the example:
 
 ```
-nvflare simulator -w /tmp/nvflare/ -n 2 -t 2 ./jobs/hello-pt-wandb
+python wandb_job.py
 ```
 
-### 3. Access the logs and results
+### 4. Access the logs and results
 
 By default, Weights and Biases will create a directory named "wandb" in the server workspace. With "mode": "online" in the WandBReceiver, the
 files will be synced with the Weights and Biases server. You can visit https://wandb.ai/ and log in to see your run data.
 
-### 4. Weights and Biases tracking
+### 5. How it works
 
-For the job `hello-pt-wandb`, on the client side, the client code in `PTLearner` uses the syntax for Weights and Biases:
+To enable tracking with Weights & Biases (WandB), you can use the `WandBWriter` utility provided by NVFlare. Here's a basic example of how to integrate it into your training script:
 
-```
-self.writer.log({"train_loss": cost.item()}, current_step)
+```python
+from nvflare.client.tracking import WandBWriter
+
+wandb_writer = WandBWriter()
+wandb_writer.log({"train_loss": cost.item()}, current_step)
 
-self.writer.log({"validation_accuracy": metric}, epoch)
 ```
 
-The `WandBWriter` mimics the syntax from Weights and Biases to send the information in events to the server through NVFlare events
-of type `analytix_log_stats` for the server to write the data for the WandB tracking server.
+The `WandBWriter` follows a syntax similar to the native WandB API, making it easy to adopt.
+
+Internally, `WandBWriter` leverages the NVFlare client API to send metrics and trigger an `analytix_log_stats` event. This event can be received and processed by our `AnalyticsReceiver`, with the `WandBReceiver` being one implementation of it.
 
-The `ConvertToFedEvent` widget turns the event `analytix_log_stats` into a fed event `fed.analytix_log_stats`,
-which will be delivered to the server side.
+In `wandb_job.py`, we configure the following components by default:
 
-On the server side, the `WandBReceiver` is configured to process `fed.analytix_log_stats` events,
-which writes received data from these events.
+- The `ConvertToFedEvent` widget on the NVFlare client side, which transforms the event `analytix_log_stats` into a fed event `fed.analytix_log_stats`. This enables the event to be sent from the NVFlare client to the NVFlare server.
 
-This allows for the server to be the only party that needs to deal with authentication for the WandB tracking server, and the server
-can buffer the events from many clients to better manage the load of requests to the tracking server.
+- The `WandBReceiver` on the NVFlare server side, which listens for `fed.analytix_log_stats` events and forwards the metric data to the WandB tracking server.
 
-### 5. Sends to WandB server directly from client side
+This setup ensures that the server handles all authentication with the WandB tracking server and buffers events from multiple clients, effectively managing the load of requests to the server.
+
+### 6. Optional: Stream Metrics Directly from Clients
+
+Alternatively, you can stream metrics **directly from each NVFlare client** to WandB, bypassing the NVFlare server entirely.
+
+To enable this mode, run your training script with the following flags:
+
+```bash
+python wandb_job.py --streamed_to_clients --no-streamed_to_server
+```
 
-You can stream the metrics to the MLFlow server without passing through the NVFlare server as well.
-Please check the job `hello-pt-wandb-client`.
+In this configuration, the `WandBReceiver` is set up on the NVFlare client side to process the `analytix_log_stats` event.
 
-You notice we configure the `WandBReceiver` on the client side to process the `analytix_log_stats` event.
+As a result, each NVFlare client sends its metrics directly to its corresponding WandB server.
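
The event flow described in the updated README can be sketched in plain Python. This is a toy model under stated assumptions, not NVFlare code: `ToyEventBus`, `ToyWriter`, `ToyReceiver`, and `build_pipeline` are hypothetical names introduced only for illustration, and the payload layout is invented. Only the two event names, `analytix_log_stats` and `fed.analytix_log_stats`, come from the README.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

LOCAL_EVENT = "analytix_log_stats"   # event name taken from the README
FED_EVENT = "fed." + LOCAL_EVENT     # fed-event name taken from the README


class ToyEventBus:
    """Hypothetical stand-in for an event system; not an NVFlare class."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def fire(self, event_type: str, payload: dict) -> None:
        # Deliver the payload to every handler registered for this event type.
        for handler in self._handlers[event_type]:
            handler(payload)


class ToyWriter:
    """Mimics a log(metrics, step) call by firing a local event."""

    def __init__(self, bus: ToyEventBus, site: str) -> None:
        self._bus = bus
        self._site = site

    def log(self, metrics: dict, step: int) -> None:
        payload = {"site": self._site, "step": step, "metrics": dict(metrics)}
        self._bus.fire(LOCAL_EVENT, payload)


class ToyReceiver:
    """Collects records; a real receiver would forward them to a tracking server."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def handle(self, payload: dict) -> None:
        self.records.append(payload)


def build_pipeline(stream_to_server: bool) -> Tuple[ToyWriter, ToyReceiver]:
    client_bus, server_bus = ToyEventBus(), ToyEventBus()
    receiver = ToyReceiver()
    if stream_to_server:
        # Converter role: re-emit the local event as a fed event on the server side.
        client_bus.subscribe(LOCAL_EVENT, lambda p: server_bus.fire(FED_EVENT, p))
        server_bus.subscribe(FED_EVENT, receiver.handle)
    else:
        # Client-side receiver: listen to the local event directly.
        client_bus.subscribe(LOCAL_EVENT, receiver.handle)
    return ToyWriter(client_bus, "site-1"), receiver
```

With `stream_to_server=True`, a logged metric reaches the receiver only through the fed event, which mirrors the default setup in section 5. With `stream_to_server=False`, the receiver handles the local event on the client, as in section 6.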

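The paired flags `--streamed_to_clients` and `--no-streamed_to_server` in section 6 have the shape that Python's `argparse.BooleanOptionalAction` (Python 3.9+) produces. The sketch below shows how such flags could be parsed. It is an assumption about how `wandb_job.py` might work, not a copy of it: the function name `parse_args` and both default values are hypothetical.

```python
import argparse
from typing import Optional, Sequence


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Hypothetical flag-parsing sketch")
    # BooleanOptionalAction auto-creates the matching --no-<name> option.
    # Assumed default: stream through the server unless told otherwise.
    parser.add_argument(
        "--streamed_to_server",
        action=argparse.BooleanOptionalAction,
        default=True,
    )
    # Assumed default: do not stream directly from clients.
    parser.add_argument(
        "--streamed_to_clients",
        action=argparse.BooleanOptionalAction,
        default=False,
    )
    return parser.parse_args(argv)
```

Under these assumed defaults, running with no flags keeps the server-side receiver, and passing both flags from section 6 switches to client-side streaming only.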
examples/advanced/experiment-tracking/wandb/jobs/hello-pt-wandb-client/app/config/config_fed_client.json

Lines changed: 0 additions & 65 deletions
This file was deleted.

examples/advanced/experiment-tracking/wandb/jobs/hello-pt-wandb-client/app/config/config_fed_server.json

Lines changed: 0 additions & 84 deletions
This file was deleted.

examples/advanced/experiment-tracking/wandb/jobs/hello-pt-wandb-client/meta.json

Lines changed: 0 additions & 10 deletions
This file was deleted.

examples/advanced/experiment-tracking/wandb/jobs/hello-pt-wandb/app/config/config_fed_client.json

Lines changed: 0 additions & 45 deletions
This file was deleted.

examples/advanced/experiment-tracking/wandb/jobs/hello-pt-wandb/app/config/config_fed_server.json

Lines changed: 0 additions & 84 deletions
This file was deleted.

examples/advanced/experiment-tracking/wandb/jobs/hello-pt-wandb/meta.json

Lines changed: 0 additions & 10 deletions
This file was deleted.
