Commit 818eb8b

Update CC docs [skip ci] (#3756)
Update CC docs

### Description

Update CC docs

### Types of changes

<!--- Put an `x` in all the boxes that apply, and remove the not applicable items -->

- [x] Non-breaking change (fix or new feature that would not break existing functionality).
- [ ] Breaking change (fix or new feature that would cause existing functionality to change).
- [ ] New tests added to cover the changes.
- [ ] Quick tests passed locally by running `./runtest.sh`.
- [ ] In-line docstrings updated.
- [ ] Documentation updated.

---------

Co-authored-by: Chester Chen <[email protected]>
1 parent 5d64c63 commit 818eb8b

3 files changed: +198 -144 lines changed

docs/user_guide/confidential_computing/attestation.rst

Lines changed: 56 additions & 87 deletions
@@ -4,99 +4,68 @@
Confidential Computing: Attestation Service Integration
#######################################################

-Data used in NVFlare is encrypted during transmission between participants, which covers the communication between the NVFlare server, clients, and admin. This security measure ensures
-data in transit is well protected. Users can also utilize existing infrastructure, such as storage encryption, to protect data at rest. With confidential computing, NVFlare can protect data in use
-and thus completes securing the entire lifecycle of data.
-
-Confidential computing in NVFlare is designed to explicitly establish the trust between participants. Each participant must first capture the evidence related to the hardware (such as a GPU), the software (GPU driver and VBIOS), and other components in its own platform. The evidence will
-be validated and signed to ensure its validity and authenticity. The owner of signed evidence, called confidential computing token (CC token), can demonstrate the information about its computing environment to other
-participants by providing the CC token. Upon receiving the CC token, the participant (the relying party) can verify the claims inside the CC token against its own security policy to determine whether the CC token owner is
-using the required hardware, software, and components for security. If the relying party finds the CC token does not meet its security policy, the relying party can inform the system that it chooses not to join the job deployment
-and will not exchange models with others. Only participants who trust and are trusted by one another will work together to run the NVFlare job.
-
-
-**********************
-Configuring CC Manager
-**********************
-
-In order to enable confidential computing in NVFlare, users need to include the CC manager, as a component, inside the resources.json file of startup kit local folder. The entire NVFlare system must
-be configured with the CC manager for either all participants or no participants.
-
-The CC manager component depends on `NVIDIA Attestation SDK <https://github.com/NVIDIA/nvtrust/tree/main/guest_tools/attestation_sdk>`_. Users have to install it as a prerequisite. This SDK also
-depends on other software stacks, such as GPU verifier, driver and others.
-
-The following is the sample configuration of CC manager.
-
-.. code-block:: json
-
-{
-"id": "cc_manager",
-"path": "nvflare.app_opt.confidential_computing.cc_manager.CCManager",
-"args": {
-"verifiers": [{"devices": "gpu", "env": "local", "url":"", "appraisal_policy_file":"evidence.plc","result_policy_file":"result.plc"}]
-}
-},
-
-
-The ``id`` is used internally by NVFlare so that other components can get its instance. The ``path`` is the complete Python module hierarchy.
-The ``args`` contains only the verifiers, a list of possible verifiers. Each verifier is a dictionary and its keys are "devices", "env",
-"url", "appraisal_policy_file" and "result_policy_file."
-
-
-The value of devices is either "gpu" or "cpu" for current Attestation SDK. The value of env is either "local" or "test" for the current Attestation SDK.
-Currently, valid combination is gpu and local or cpu and test. The value of url must be an empty string.
-The appraisal_policy_file and result_policy_file must point to an existing file. The former is currently ignored by Attestation SDK.
-The latter currently supports the following content only
-
-.. code-block:: json
-
-{
-"version":"1.0",
-"authorization-rules":{
-"x-nv-gpu-available":true,
-"x-nv-gpu-attestation-report-available":true,
-"x-nv-gpu-info-fetched":true,
-"x-nv-gpu-arch-check":true,
-"x-nv-gpu-root-cert-available":true,
-"x-nv-gpu-cert-chain-verified":true,
-"x-nv-gpu-ocsp-cert-chain-verified":true,
-"x-nv-gpu-ocsp-signature-verified":true,
-"x-nv-gpu-cert-ocsp-nonce-match":true,
-"x-nv-gpu-cert-check-complete":true,
-"x-nv-gpu-measurement-available":true,
-"x-nv-gpu-attestation-report-parsed":true,
-"x-nv-gpu-nonce-match":true,
-"x-nv-gpu-attestation-report-driver-version-match":true,
-"x-nv-gpu-attestation-report-vbios-version-match":true,
-"x-nv-gpu-attestation-report-verified":true,
-"x-nv-gpu-driver-rim-schema-fetched":true,
-"x-nv-gpu-driver-rim-schema-validated":true,
-"x-nv-gpu-driver-rim-cert-extracted":true,
-"x-nv-gpu-driver-rim-signature-verified":true,
-"x-nv-gpu-driver-rim-driver-measurements-available":true,
-"x-nv-gpu-driver-vbios-rim-fetched":true,
-"x-nv-gpu-vbios-rim-schema-validated":true,
-"x-nv-gpu-vbios-rim-cert-extracted":true,
-"x-nv-gpu-vbios-rim-signature-verified":true,
-"x-nv-gpu-vbios-rim-driver-measurements-available":true,
-"x-nv-gpu-vbios-index-conflict":true,
-"x-nv-gpu-measurements-match":true
-}
-}
+Please refer to the :ref:`NVFLARE CC Architecture <cc_architecture>`
+for the introduction and detailed architecture of Confidential Computing.

+This document introduces CC attestation integration in NVFlare.
+
+Each participant uses the corresponding CCAuthorizer to generate its CC token.
+
+For example, the SNPAuthorizer uses AMD's snpguest utility to generate
+an attestation report and package it into a CC token.
+
+In NVFlare, the participant first generates the CC token, then presents its
+CC token to others to prove the integrity and trustworthiness of its environment.
+
+Upon receiving a CC token, the participant verifies its claims against its own
+security policy. This check ensures that the token owner is using the required
+hardware, software, and configurations to meet the security standards.
+
+If verification fails (i.e., the CC token does not meet the policy), the site
+may choose not to participate in the job. It will not exchange models or
+collaborate further.
+
+This mechanism ensures that only mutually trusted participants take part in a
+federated learning job, reinforcing both security and integrity across the
+NVFlare system.
+
+We provide a CCManager component and several CCAuthorizer components for different hardware platforms.
+Currently, we support the following CCAuthorizer components:
+- SNPAuthorizer
+- GPUAuthorizer
+- ACIAuthorizer
+- TDXAuthorizer
+
+You can configure these components using the provision step in the :ref:`NVFLARE CC Deployment Guide <cc_deployment_guide>`.
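The verification step described in the new text can be illustrated with a minimal Python sketch. This is not NVFlare's implementation: the helpers `unmet_rules` and `meets_policy` and the flat claim dictionaries are hypothetical, and the claim names are borrowed from the result policy sample that this commit removes.

```python
from typing import Any, Dict, List

_MISSING = object()  # sentinel so an absent claim never equals a required value


def unmet_rules(claims: Dict[str, Any], rules: Dict[str, Any]) -> List[str]:
    """Return the policy rule names that the token's claims do not satisfy."""
    # A rule is unmet when the claim is absent or differs from the required value.
    return [
        name
        for name, required in rules.items()
        if claims.get(name, _MISSING) != required
    ]


def meets_policy(claims: Dict[str, Any], rules: Dict[str, Any]) -> bool:
    """A token is trusted only if every rule in the policy is satisfied."""
    return not unmet_rules(claims, rules)


# Hypothetical policy using three claim names from the removed sample.
policy_rules = {
    "x-nv-gpu-available": True,
    "x-nv-gpu-attestation-report-verified": True,
    "x-nv-gpu-measurements-match": True,
}

good_claims = dict(policy_rules)
bad_claims = {**policy_rules, "x-nv-gpu-measurements-match": False}

print(meets_policy(good_claims, policy_rules))
print(unmet_rules(bad_claims, policy_rules))
```

Returning the list of unmet rules, rather than only a boolean, lets a site report why it declined a job.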

****************
Runtime behavior
****************

-When one participant, either server or client, starts, the CC manager reacts to EventType.SYSTEM_BOOTSTRAP and retrieves its own CC token via Attestation SDK after the Attestation SDK successfully communicates
-with the software stacks and hardware. This CC token will be stored locally in CC manager.
+When a participant (either the server or a client) starts up, the CCManager
+responds to the EventType.SYSTEM_BOOTSTRAP event by generating its own
+CC token using the configured CCAuthorizers.
+
+When a client registers with the server, it includes its CC token as part
+of the registration data. If the registration is successful, the server
+collects and stores the client's CC token.
+
+The server's CCManager maintains both its own CC token and the tokens of all
+registered clients.
+
+Once a job is submitted and scheduled for deployment, the server verifies the
+CC tokens of the clients listed in the job's deployment map, using its own
+result policy.
+
+If all client tokens in the deployment map pass verification, the server sends
+the verified tokens to those clients for peer verification.

-When the client registers itself with the server, it also includes its CC token in the registration data. Server will collect the client's CC token if it successfully registers. The server CC manager keeps
-all client's CC tokens as well as its own token.
+Each client then evaluates the received CC tokens against its own result policy
+to decide whether it trusts the other participants. Based on this evaluation,
+the client may choose to accept or reject participation in the job.

-After a submitted job is scheduled to be deployed, the server verifies the CC tokens of clients that are included in the deployment map based on its result policy. If server finds
-all tokens from clients in the deployment map are verified successfully, those tokens will be sent to clients in deployment map for client side verification. The client can determine whether it
-wants to join this job or not based on the result of verifying others' CC tokens against its own result policy. If one client decides not to join the job, server will not deploy that job to that client.
+If a client declines to join the job, the server excludes it from deployment.

-The server job scheduler will determine if the job has enough resources to be deployed and will determine the job's final status based on resource availability and retry policy.
+Finally, the server's job scheduler determines whether the job has sufficient
+resources to proceed. It finalizes the job's status based on resource
+availability and any defined retry policies.
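The runtime flow described in the added lines can be sketched in plain Python. This is only an illustration under stated assumptions, not the CCManager code: `sites_to_deploy`, the `Token` and `Policy` types, and the claim names `attested` and `debug_disabled` are all hypothetical. The sketch also assumes that a failed server-side check stops token distribution entirely, and that each client evaluates only its peers' tokens; the documentation does not specify either detail.

```python
from typing import Callable, Dict, List

Token = Dict[str, bool]           # hypothetical: a CC token reduced to its claims
Policy = Callable[[Token], bool]  # hypothetical: True means the token is trusted


def sites_to_deploy(
    deploy_map: List[str],
    tokens: Dict[str, Token],
    server_policy: Policy,
    client_policies: Dict[str, Policy],
) -> List[str]:
    """Return the clients that remain in the deployment after CC checks."""
    # Step 1: the server verifies every client token in the deployment map.
    for site in deploy_map:
        if site not in tokens or not server_policy(tokens[site]):
            return []  # assumption: nothing is sent to clients on failure

    # Step 2: each client checks its peers' tokens against its own policy.
    accepted = []
    for site in deploy_map:
        peer_tokens = [tokens[p] for p in deploy_map if p != site]
        if all(client_policies[site](t) for t in peer_tokens):
            accepted.append(site)  # this client agrees to join
        # Step 3: a client that declines is left out of the deployment.
    return accepted


def lenient(token: Token) -> bool:
    return token.get("attested", False)


def strict(token: Token) -> bool:
    return token.get("attested", False) and token.get("debug_disabled", False)


tokens = {
    "site-1": {"attested": True, "debug_disabled": True},
    "site-2": {"attested": True, "debug_disabled": True},
    "site-3": {"attested": True, "debug_disabled": False},
}
policies = {"site-1": strict, "site-2": lenient, "site-3": lenient}

print(sites_to_deploy(["site-1", "site-2", "site-3"], tokens, lenient, policies))
```

In this example site-1 applies the stricter policy, rejects site-3's token, and is therefore excluded. The job scheduler's resource check would then run against the remaining sites, which the sketch does not model.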
