Commit ec66533

CC provision with docker image workload (#3639)
### Description

CC provision with docker image workload for a generic way to accommodate different applications in the future

### Types of changes

- [x] Non-breaking change (fix or new feature that would not break existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Quick tests passed locally by running `./runtest.sh`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated.

Co-authored-by: Chester Chen <[email protected]>
1 parent f61c601 commit ec66533

15 files changed: +343 -184 lines changed
Lines changed: 83 additions & 56 deletions
@@ -1,85 +1,112 @@
11
# How to use CC provision
22

3-
In project.yml, under each site add "cc_config: [file]", for example:
3+
This guide explains how to use **CC (Confidential Computing) Provision** in NVFlare, including setting up site configurations, enabling the CC builder, and using Docker images for CC workloads.
4+
5+
6+
## 0. Prepare application docker image workload
7+
8+
CC does not allow custom code: all code and required libraries must be built into the Docker image.
9+
The [docker/](docker/README.md) folder of this example shows how to build NVFlare Docker images.
10+
11+
12+
## 1. Define CC Configuration per Site (`cc_config`)
13+
14+
Each site participating in a CC job must provide a **CC configuration file**. This file describes the trusted execution environment (e.g., AMD SEV-SNP on-prem CVM), drive allocations, and attestation policies.
15+
16+
Here is an example (`cc_server.yml`):
17+
18+
19+
```yaml
20+
compute_env: onprem_cvm
21+
cc_cpu_mechanism: amd_sev_snp
22+
role: server
23+
24+
# All drive sizes are in GB
25+
root_drive_size: 10
26+
applog_drive_size: 1
27+
user_data_drive_size: 1
28+
secure_drive_size: 10
29+
30+
# Docker image archive saved using:
31+
# docker save <image_name> | gzip > app.tar.gz
32+
docker_archive: /tmp/base_images/app.tar.gz
33+
34+
allowed_ports:
35+
- 8002
36+
37+
cc_issuers:
38+
- id: snp_authorizer
39+
path: nvflare.app_opt.confidential_computing.snp_authorizer.SNPAuthorizer
40+
token_expiration: 100 # seconds, needs to be less than check_frequency
41+
cc_attestation:
42+
check_frequency: 120 # seconds
43+
failure_action: stop_job
44+
```
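The comment on `token_expiration` states that it needs to be less than `check_frequency`. A minimal sketch of that check follows. The helper `find_expiration_problems` is hypothetical and not part of NVFlare, and the dictionary assumes the nesting shown in the example above (`cc_issuers` as a list of entries, `cc_attestation` as a mapping).

```python
# Hypothetical helper, not an NVFlare API. The dict mirrors the example
# cc_server.yml, assuming cc_issuers is a list and cc_attestation a mapping.
cc_config = {
    "cc_issuers": [
        {
            "id": "snp_authorizer",
            "path": "nvflare.app_opt.confidential_computing.snp_authorizer.SNPAuthorizer",
            "token_expiration": 100,  # seconds
        }
    ],
    "cc_attestation": {"check_frequency": 120, "failure_action": "stop_job"},
}


def find_expiration_problems(config):
    """Return one message per issuer whose token_expiration is not below check_frequency."""
    check_frequency = config["cc_attestation"]["check_frequency"]
    problems = []
    for issuer in config.get("cc_issuers", []):
        expiration = issuer["token_expiration"]
        if expiration >= check_frequency:
            problems.append(
                f"{issuer['id']}: token_expiration {expiration} "
                f"must be less than check_frequency {check_frequency}"
            )
    return problems


print(find_expiration_problems(cc_config))
```

Running a check like this before provisioning may catch a mismatched pair of values early; the actual validation performed by the CC builder, if any, is not shown in this guide.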
45+
46+
## 2. Reference `cc_config` in `project.yml`
47+
48+
In your `project.yml`, reference the CC configuration file for each site using the `cc_config` key:
449

550
```yaml
651
participants:
7-
- name: site-1
8-
type: client
52+
- name: server1
53+
type: server
954
org: nvidia
10-
cc_config: cc_site-1.yml
55+
fed_learn_port: 8002
56+
cc_config: cc_server1.yml
1157
```
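As a rough illustration of the rule above, the following sketch lists sites that lack a `cc_config` entry. The function `sites_missing_cc_config` is hypothetical, the `site-1` entry is made up to show a failing case, and treating only `server` and `client` participants as sites is an assumption.

```python
# Hypothetical check, not an NVFlare API. The dict stands in for a parsed
# project.yml; "site-1" is an invented entry that deliberately omits cc_config.
project = {
    "participants": [
        {
            "name": "server1",
            "type": "server",
            "org": "nvidia",
            "fed_learn_port": 8002,
            "cc_config": "cc_server1.yml",
        },
        {"name": "site-1", "type": "client", "org": "nvidia"},
    ]
}

# Assumption: only server and client participants need a cc_config.
SITE_TYPES = ("server", "client")


def sites_missing_cc_config(project):
    """Return the names of server/client participants without a cc_config key."""
    return [
        participant["name"]
        for participant in project.get("participants", [])
        if participant.get("type") in SITE_TYPES and not participant.get("cc_config")
    ]


print(sites_missing_cc_config(project))
```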
1258

13-
Then in the end of builders add:
59+
## 3. Add the CCBuilder
1460

15-
```
61+
At the end of the `builders` section in your `project.yml`, add the `CCBuilder`:
62+
63+
```yaml
1664
builders:
1765
- path: nvflare.lighter.cc_provision.impl.cc.CCBuilder
1866
```
1967

20-
Then use the following command to generate startup kits:
21-
22-
```bash
23-
nvflare provision -p project.yml
24-
```
68+
This builder sets up all CC-related configurations and assets.
2569

26-
# NVFlare application code package
70+
## 4. Add the OnPremPackager
2771

28-
For CC jobs, we don't allow custom codes, so we must pre-install those codes inside each CVM.
29-
We utilize our nvflare pre-install command to do that.
30-
31-
First, we need to prepare the application_code_zip folder structure:
72+
To generate startup kits for on-premises deployment, add the `OnPremPackager`:
3273

33-
```bash
34-
application_code_folder
35-
├── application/ # optional
36-
│ └── <job_name>/
37-
│ ├── meta.json # job metadata
38-
│ ├── app_<site>/ # Site custom code
39-
│ └── custom/ # Site custom code
40-
├── application-share/ # Shared resources
41-
| └── simple_network.py # Shared model definition
42-
└── requirements.txt # Python dependencies (optional)
74+
```yaml
75+
packager:
76+
path: nvflare.lighter.cc_provision.impl.onprem_packager.OnPremPackager
77+
args:
78+
build_image_cmd: build_cvm_image.sh
4379
```
4480

45-
We have already prepared application-share folder and requirements.txt in this example.
46-
We run the following command to create a zip folder so we can use that to build the CVM:
81+
Note:
82+
1. `build_image_cmd`: Path to the script used to build the CVM disk image.
83+
2. For the 2.7.0 Technical Preview release, please contact `[email protected]` to receive the `build_cvm_image.sh` script.
4784

48-
```bash
49-
python -m zipfile -c application_code.zip application_code/*
50-
```
85+
## 5. Generate the Startup Kits
5186

52-
# Content inside CC configuration
87+
Once you have added all the required sections to your `project.yml`, run the provision command:
5388

89+
```bash
90+
nvflare provision -p project.yml
5491
```
55-
compute_env: onprem_cvm
56-
cc_cpu_mechanism: amd_sev_snp
57-
role: server
5892

59-
# All drive sizes are in GB
60-
root_drive_size: 15
61-
secure_drive_size: 2
62-
data_source: /tmp/data
93+
## 6. Distribute and deploy
6394

64-
# Can be any pip-installable version string (e.g., "2.6.0", "latest", Git URL, etc.)
65-
nvflare_version: "2.6.0"
95+
Each site's output will be located at:
6696

67-
# NVFlare application code package to be pre-installed inside the CVM
68-
nvflare_package: application_code.zip
69-
allowed_ports:
70-
- 8002
71-
trustee_host: trustee-azsnptpm.eastus.cloudapp.azure.com
72-
trustee_port: 8999
97+
```bash
98+
workspace/example_project/prod_xx/[site_name]/[site_name].tgz
99+
```
73100

74-
cc_issuers:
75-
- id: snp_authorizer
76-
path: nvflare.app_opt.confidential_computing.snp_authorizer.SNPAuthorizer
77-
token_expiration: 150 # in seconds, needs to be less than check_frequency
78-
- id: gpu_authorizer
79-
path: nvflare.app_opt.confidential_computing.gpu_authorizer.GPUAuthorizer
80-
token_expiration: 150 # in seconds, needs to be less than check_frequency
101+
You can distribute these `.tgz` files to their respective sites.
81102

82-
cc_attestation:
83-
check_frequency: 300 # in seconds
103+
To deploy on each site, run:
84104

105+
```bash
106+
tar -zxvf [site_name].tgz
107+
cd cvm_xxx
108+
./launch_vm.sh
85109
```
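For sites that script their deployment, the path layout and the `tar` step above can be sketched in Python with the standard library. The function names and their parameters are hypothetical; launching the VM with `./launch_vm.sh` is still done as shown above.

```python
import tarfile
from pathlib import Path


def site_package_path(workspace, project_name, prod_dir, site_name):
    """Build <workspace>/<project>/<prod_dir>/<site>/<site>.tgz, following the layout above."""
    return Path(workspace) / project_name / prod_dir / site_name / f"{site_name}.tgz"


def extract_site_package(package, dest):
    """Unpack a site's .tgz into dest and return the names of the top-level entries."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(package, "r:gz") as archive:
        archive.extractall(dest)
    return sorted(entry.name for entry in dest.iterdir())
```

For example, `extract_site_package(site_package_path("workspace", "example_project", "prod_00", "server1"), "deploy")` would unpack the server package into `deploy/`, where `prod_00` is a placeholder for the actual `prod_xx` folder.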
110+
111+
The confidential VM will start, and the NVFlare server and clients will automatically connect and begin operation.
112+
You can now use the NVFlare admin console to communicate with the NVFlare system.

examples/advanced/cc_provision/cc_server1.yml

Lines changed: 15 additions & 13 deletions
@@ -3,24 +3,26 @@ cc_cpu_mechanism: amd_sev_snp
33
role: server
44

55
# All drive sizes are in GB
6-
root_drive_size: 15
7-
secure_drive_size: 2
8-
data_source: /tmp/data
6+
root_drive_size: 30
7+
applog_drive_size: 1
8+
user_config_drive_size: 1
9+
user_data_drive_size: 1
10+
# Docker image archive saved using:
11+
# docker save <image_name> | gzip > app.tar.gz
12+
docker_archive: /tmp/base_images/app.tar.gz
13+
# will be mounted inside docker at "/user_config/nvflare"
14+
user_config:
15+
nvflare: /tmp/startup_kits
916

10-
# Can be any pip-installable version string (e.g., "2.6.0", "latest", Git URL, etc.)
11-
nvflare_version: "2.6.0"
12-
13-
# NVFlare application code package to be pre-installed inside the CVM
14-
nvflare_package: application_code.zip
1517
allowed_ports:
16-
- 8002
17-
trustee_host: trustee-azsnptpm.eastus.cloudapp.azure.com
18-
trustee_port: 8999
18+
- 8002
1919

2020
cc_issuers:
2121
- id: snp_authorizer
2222
path: nvflare.app_opt.confidential_computing.snp_authorizer.SNPAuthorizer
23-
token_expiration: 150 # needs to be less than check_frequency
23+
token_expiration: 100 # seconds, needs to be less than check_frequency
24+
snpguest_binary: "/host/bin/snpguest"
2425

2526
cc_attestation:
26-
check_frequency: 300
27+
check_frequency: 120 # seconds
28+
failure_action: stop_job

examples/advanced/cc_provision/cc_site-1.yml

Lines changed: 18 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -4,27 +4,31 @@ cc_gpu_mechanism: nvidia_cc
44
role: client
55

66
# All drive sizes are in GB
7-
root_drive_size: 15
8-
secure_drive_size: 2
9-
data_source: /tmp/data
7+
root_drive_size: 30
8+
applog_drive_size: 1
9+
user_config_drive_size: 1
10+
user_data_drive_size: 1
11+
# Docker image archive saved using:
12+
# docker save <image_name> | gzip > app.tar.gz
13+
docker_archive: /tmp/base_images/app.tar.gz
1014

11-
# Can be any pip-installable version string (e.g., "2.6.0", "latest", Git URL, etc.)
12-
nvflare_version: "2.6.0"
15+
# for debugging purpose
16+
# hosts_entries:
17+
# server1: 1.2.3.4
1318

14-
# NVFlare application code package to be pre-installed inside the CVM
15-
nvflare_package: application_code.zip
16-
hosts_entries:
17-
server1: 1.2.3.4
18-
trustee_host: trustee-azsnptpm.eastus.cloudapp.azure.com
19-
trustee_port: 8999
19+
# will be mounted inside docker at "/user_config/nvflare"
20+
user_config:
21+
nvflare: /tmp/startup_kits
2022

2123
cc_issuers:
2224
- id: snp_authorizer
2325
path: nvflare.app_opt.confidential_computing.snp_authorizer.SNPAuthorizer
24-
token_expiration: 150 # needs to be less than check_frequency
26+
token_expiration: 100 # seconds, needs to be less than check_frequency
27+
snpguest_binary: "/host/bin/snpguest"
2528
- id: gpu_authorizer
2629
path: nvflare.app_opt.confidential_computing.gpu_authorizer.GPUAuthorizer
27-
token_expiration: 150 # needs to be less than check_frequency
30+
token_expiration: 100 # seconds, needs to be less than check_frequency
2831

2932
cc_attestation:
30-
check_frequency: 300
33+
check_frequency: 120 # seconds
34+
failure_action: stop_job
Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
ARG BASE_IMAGE=python:3.12
2+
3+
FROM ${BASE_IMAGE}
4+
5+
ENV PYTHONDONTWRITEBYTECODE=1
6+
ENV PIP_NO_CACHE_DIR=1
7+
ENV PATH="/host/bin:${PATH}"
8+
9+
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install zip
10+
RUN pip install -U pip
11+
RUN pip install git+https://github.com/NVIDIA/NVFlare.git@main
12+
COPY application_code.zip application_code.zip
13+
RUN nvflare pre-install install -a application_code.zip
14+
15+
ENTRYPOINT ["/user_config/nvflare/startup/sub_start.sh", "--verify"]
16+
Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
This is a simple example showing how to build an NVFlare Docker image.
2+
3+
4+
# NVFlare application code package
5+
We can pre-install custom code inside each Docker image.
6+
We use the `nvflare pre-install` command to do that.
7+
8+
First, we need to prepare the `application_code.zip` folder structure:
9+
10+
```bash
11+
application_code_folder
12+
├── application/ # optional
13+
│ └── <job_name>/
14+
│ ├── meta.json # job metadata
15+
│ ├── app_<site>/ # Site custom code
16+
│ └── custom/ # Site custom code
17+
├── application-share/ # Shared resources
18+
│ └── simple_network.py # Shared model definition
19+
└── requirements.txt # Python dependencies (optional)
20+
```
21+
22+
We have already prepared the `application-share` folder and `requirements.txt` in this example.
23+
Run the following command to create a zip archive that we can use to build the CVM:
24+
25+
```bash
26+
python -m zipfile -c application_code.zip application_code/*
27+
```
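If you prefer to create the archive from a script, the following sketch roughly mirrors the command above using the standard `zipfile` module. The function `zip_folder_contents` is a hypothetical name, and unlike the shell glob it also includes hidden files at the top level.

```python
import zipfile
from pathlib import Path


def zip_folder_contents(folder, archive_path):
    """Zip every file under folder, storing paths relative to folder.

    Returns the list of names stored in the archive.
    """
    folder = Path(folder)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            # Skip directories, and skip the archive itself if it lives inside folder.
            if path.is_file() and path.resolve() != archive_path.resolve():
                archive.write(path, path.relative_to(folder).as_posix())
    with zipfile.ZipFile(archive_path) as archive:
        return archive.namelist()
```

Calling `zip_folder_contents("application_code", "application_code.zip")` should place `application-share/` and `requirements.txt` at the top level of the archive, which matches what the `application_code/*` glob produces.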
28+
29+
# NVFlare docker file
30+
31+
We have prepared a `Dockerfile`.
32+
Please run the following command to build the NVFlare docker image:
33+
34+
```bash
35+
./docker_build.sh Dockerfile nvflare-site
36+
```
Lines changed: 33 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,33 @@
1+
#!/bin/bash
2+
3+
set -euo pipefail
4+
5+
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
6+
DOCKERFILE=${1:-}
7+
IMAGE_TAG=${2:-}
8+
9+
# Validate inputs
10+
if [[ -z "$DOCKERFILE" || -z "$IMAGE_TAG" ]]; then
11+
echo "Error: Missing arguments."
12+
echo "Usage: $0 <Dockerfile> <image_tag>"
13+
echo "Example: $0 Dockerfile.site nvflare_site"
14+
exit 1
15+
fi
16+
17+
DOCKERFILE_PATH="$SCRIPT_DIR/$DOCKERFILE"
18+
19+
if [[ ! -f "$DOCKERFILE_PATH" ]]; then
20+
echo "Error: Dockerfile not found at $DOCKERFILE_PATH"
21+
exit 1
22+
fi
23+
24+
echo "Calling build_nvflare_docker.sh using Dockerfile $DOCKERFILE and $IMAGE_TAG"
25+
docker build -t "$IMAGE_TAG" -f "$SCRIPT_DIR/$DOCKERFILE" "$SCRIPT_DIR"
26+
docker save "$IMAGE_TAG" | gzip > "$SCRIPT_DIR/${IMAGE_TAG}.tar.gz"
27+
28+
if [[ $? -eq 0 ]]; then
29+
echo "Docker image successfully saved to: $SCRIPT_DIR/${IMAGE_TAG}.tar.gz"
30+
else
31+
echo "Failed to save Docker image"
32+
exit 1
33+
fi

examples/advanced/cc_provision/project.yml

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -58,6 +58,10 @@ builders:
5858
- path: nvflare.lighter.impl.cert.CertBuilder
5959
- path: nvflare.lighter.impl.signature.SignatureBuilder
6060
- path: nvflare.lighter.cc_provision.impl.cc.CCBuilder
61+
- path: nvflare.lighter.impl.docker_image_builder.DockerImageBuilder
62+
args:
63+
base_dockerfile: Dockerfile.base
64+
requirement: git+https://github.com/NVIDIA/NVFlare.git@main
6165
packager:
6266
path: nvflare.lighter.cc_provision.impl.onprem_packager.OnPremPackager
6367
args:
