|
4 | 4 | Confidential Computing: Attestation Service Integration |
5 | 5 | ####################################################### |
6 | 6 |
|
7 | | -Data used in NVFlare is encrypted during transmission between participants, which covers the communication between the NVFlare server, clients, and admin. This security measure ensures |
8 | | -data in transit is well protected. Users can also utilize existing infrastructure, such as storage encryption, to protect data at rest. With confidential computing, NVFlare can protect data in use |
9 | | -and thus completes securing the entire lifecycle of data. |
10 | | - |
11 | | -Confidential computing in NVFlare is designed to explicitly establish the trust between participants. Each participant must first capture the evidence related to the hardware (such as a GPU), the software (GPU driver and VBIOS), and other components in its own platform. The evidence will |
12 | | -be validated and signed to ensure its validity and authenticity. The owner of signed evidence, called confidential computing token (CC token), can demonstrate the information about its computing environment to other |
13 | | -participants by providing the CC token. Upon receiving the CC token, the participant (the relying party) can verify the claims inside the CC token against its own security policy to determine whether the CC token owner is |
14 | | -using the required hardware, software, and components for security. If the relying party finds the CC token does not meet its security policy, the relying party can inform the system that it chooses not to join the job deployment |
15 | | -and will not exchange models with others. Only participants who trust and are trusted by one another will work together to run the NVFlare job. |
16 | | - |
17 | | - |
18 | | -********************** |
19 | | -Configuring CC Manager |
20 | | -********************** |
21 | | - |
22 | | -In order to enable confidential computing in NVFlare, users need to include the CC manager, as a component, inside the resources.json file of startup kit local folder. The entire NVFlare system must |
23 | | -be configured with the CC manager for either all participants or no participants. |
24 | | - |
25 | | -The CC manager component depends on `NVIDIA Attestation SDK <https://github.com/NVIDIA/nvtrust/tree/main/guest_tools/attestation_sdk>`_. Users have to install it as a prerequisite. This SDK also |
26 | | -depends on other software stacks, such as GPU verifier, driver and others. |
27 | | - |
28 | | -The following is the sample configuration of CC manager. |
29 | | - |
30 | | -.. code-block:: json |
31 | | -
|
32 | | - { |
33 | | - "id": "cc_manager", |
34 | | - "path": "nvflare.app_opt.confidential_computing.cc_manager.CCManager", |
35 | | - "args": { |
36 | | - "verifiers": [{"devices": "gpu", "env": "local", "url":"", "appraisal_policy_file":"evidence.plc","result_policy_file":"result.plc"}] |
37 | | - } |
38 | | - }, |
39 | | -
|
40 | | -
|
41 | | -The ``id`` is used internally by NVFlare so that other components can get its instance. The ``path`` is the complete Python module hierarchy. |
42 | | -The ``args`` contains only the verifiers, a list of possible verifiers. Each verifier is a dictionary and its keys are "devices", "env", |
43 | | -"url", "appraisal_policy_file" and "result_policy_file." |
44 | | - |
45 | | - |
46 | | -The value of devices is either "gpu" or "cpu" for current Attestation SDK. The value of env is either "local" or "test" for the current Attestation SDK. |
47 | | -Currently, valid combination is gpu and local or cpu and test. The value of url must be an empty string. |
48 | | -The appraisal_policy_file and result_policy_file must point to an existing file. The former is currently ignored by Attestation SDK. |
49 | | -The latter currently supports the following content only |
50 | | - |
51 | | -.. code-block:: json |
52 | | -
|
53 | | - { |
54 | | - "version":"1.0", |
55 | | - "authorization-rules":{ |
56 | | - "x-nv-gpu-available":true, |
57 | | - "x-nv-gpu-attestation-report-available":true, |
58 | | - "x-nv-gpu-info-fetched":true, |
59 | | - "x-nv-gpu-arch-check":true, |
60 | | - "x-nv-gpu-root-cert-available":true, |
61 | | - "x-nv-gpu-cert-chain-verified":true, |
62 | | - "x-nv-gpu-ocsp-cert-chain-verified":true, |
63 | | - "x-nv-gpu-ocsp-signature-verified":true, |
64 | | - "x-nv-gpu-cert-ocsp-nonce-match":true, |
65 | | - "x-nv-gpu-cert-check-complete":true, |
66 | | - "x-nv-gpu-measurement-available":true, |
67 | | - "x-nv-gpu-attestation-report-parsed":true, |
68 | | - "x-nv-gpu-nonce-match":true, |
69 | | - "x-nv-gpu-attestation-report-driver-version-match":true, |
70 | | - "x-nv-gpu-attestation-report-vbios-version-match":true, |
71 | | - "x-nv-gpu-attestation-report-verified":true, |
72 | | - "x-nv-gpu-driver-rim-schema-fetched":true, |
73 | | - "x-nv-gpu-driver-rim-schema-validated":true, |
74 | | - "x-nv-gpu-driver-rim-cert-extracted":true, |
75 | | - "x-nv-gpu-driver-rim-signature-verified":true, |
76 | | - "x-nv-gpu-driver-rim-driver-measurements-available":true, |
77 | | - "x-nv-gpu-driver-vbios-rim-fetched":true, |
78 | | - "x-nv-gpu-vbios-rim-schema-validated":true, |
79 | | - "x-nv-gpu-vbios-rim-cert-extracted":true, |
80 | | - "x-nv-gpu-vbios-rim-signature-verified":true, |
81 | | - "x-nv-gpu-vbios-rim-driver-measurements-available":true, |
82 | | - "x-nv-gpu-vbios-index-conflict":true, |
83 | | - "x-nv-gpu-measurements-match":true |
84 | | - } |
85 | | - } |
| 7 | +Please refer to the :ref:`NVFLARE CC Architecture <cc_architecture>`
| 8 | +for an introduction to Confidential Computing and its detailed architecture.
86 | 9 |
|
| 10 | +This document describes the CC attestation integration in NVFlare.
| 11 | + |
| 12 | +Each participant uses the CCAuthorizer for its platform to generate its CC token.
| 13 | + |
| 14 | +For example, the SNPAuthorizer uses AMD's snpguest utility to generate
| 15 | +an attestation report and packages it into a CC token.
| 16 | + |
| 17 | +In NVFlare, each participant first generates its CC token, then presents it
| 18 | +to the others to demonstrate the integrity and trustworthiness of its environment.
| 19 | + |
| 20 | +Upon receiving a CC token, the participant verifies its claims against its own |
| 21 | +security policy. This check ensures that the token owner is using the required |
| 22 | +hardware, software, and configurations to meet the security standards. |
| 23 | + |
| 24 | +If verification fails (that is, the CC token does not meet the policy), the
| 25 | +site may choose not to participate in the job. In that case it does not
| 26 | +exchange models or collaborate further.
| 27 | + |
| 28 | +This mechanism ensures that only mutually trusted participants take part in a |
| 29 | +federated learning job, reinforcing both security and integrity across the |
| 30 | +NVFlare system. |
| 31 | + |
| 32 | +NVFlare provides a CCManager component and several CCAuthorizer components for different hardware platforms.
| 33 | +Currently, we support the following CCAuthorizer components: |
| 34 | +- SNPAuthorizer |
| 35 | +- GPUAuthorizer |
| 36 | +- ACIAuthorizer |
| 37 | +- TDXAuthorizer |
| 38 | + |
| 39 | +You can configure these components during the provision step described in the :ref:`NVFLARE CC Deployment Guide <cc_deployment_guide>`.
87 | 40 |
|
88 | 41 | **************** |
89 | 42 | Runtime behavior |
90 | 43 | **************** |
91 | 44 |
|
92 | | -When one participant, either server or client, starts, the CC manager reacts to EventType.SYSTEM_BOOTSTRAP and retrieves its own CC token via Attestation SDK after the Attestation SDK successfully communicates |
93 | | -with the software stacks and hardware. This CC token will be stored locally in CC manager. |
| 45 | +When a participant (either the server or a client) starts up, the CCManager
| 46 | +responds to the EventType.SYSTEM_BOOTSTRAP event by generating its own |
| 47 | +CC token using the configured CCAuthorizers. |
| 48 | + |
| 49 | +When a client registers with the server, it includes its CC token as part |
| 50 | +of the registration data. If the registration is successful, the server |
| 51 | +collects and stores the client's CC token. |
| 52 | + |
| 53 | +The server's CCManager maintains both its own CC token and the tokens of all |
| 54 | +registered clients. |
| 55 | + |
| 56 | +Once a job is submitted and scheduled for deployment, the server verifies the |
| 57 | +CC tokens of the clients listed in the job's deployment map, using its own |
| 58 | +result policy. |
| 59 | + |
| 60 | +If all client tokens in the deployment map pass verification, the server sends |
| 61 | +the verified tokens to those clients for peer verification. |
94 | 62 |
|
95 | | -When the client registers itself with the server, it also includes its CC token in the registration data. Server will collect the client's CC token if it successfully registers. The server CC manager keeps |
96 | | -all client's CC tokens as well as its own token. |
| 63 | +Each client then evaluates the received CC tokens against its own result policy |
| 64 | +to decide whether it trusts the other participants. Based on this evaluation, |
| 65 | +the client may choose to accept or reject participation in the job. |
97 | 66 |
|
98 | | -After a submitted job is scheduled to be deployed, the server verifies the CC tokens of clients that are included in the deployment map based on its result policy. If server finds |
99 | | - all tokens from clients in the deployment map are verified successfully, those tokens will be sent to clients in deployment map for client side verification. The client can determine whether it |
100 | | - wants to join this job or not based on the result of verifying others' CC tokens against its own result policy. If one client decides not to join the job, server will not deploy that job to that client. |
| 67 | +If a client declines to join the job, the server excludes it from deployment. |
101 | 68 |
|
102 | | -The server job scheduler will determine if the job has enough resources to be deployed and will determine the job's final status based on resource availability and retry policy. |
| 69 | +Finally, the server's job scheduler determines whether the job has sufficient |
| 70 | +resources to proceed. It finalizes the job's status based on resource |
| 71 | +availability and any defined retry policies. |
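
The runtime flow described above can be sketched as a small, self-contained Python model. This is only an illustration under stated assumptions, not NVFlare's implementation: ``Participant`` and ``plan_deployment`` are hypothetical names, a CC token is reduced to a plain dictionary of claims, and a policy is a dictionary of required claim values. The sketch also assumes that a failed server-side check leaves the job undeployed, which the text above does not specify.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical stand-in for an NVFlare site; not an NVFlare class."""
    name: str
    claims: dict   # what this site's CC token asserts about its environment
    policy: dict   # claims this site requires of any token it verifies

    def generate_token(self) -> dict:
        # A real CCAuthorizer would return signed attestation evidence.
        return {"owner": self.name, "claims": dict(self.claims)}

    def verify(self, token: dict) -> bool:
        # Accept only if every claim required by the policy is present and equal.
        return all(token["claims"].get(k) == v for k, v in self.policy.items())


def plan_deployment(server, clients, deployment_map):
    """Return the names of the clients the job would be deployed to."""
    # Registration: the server stores each client's token.
    tokens = {c.name: c.generate_token() for c in clients}
    by_name = {c.name: c for c in clients}

    # Server-side check of every client listed in the deployment map.
    if not all(server.verify(tokens[n]) for n in deployment_map):
        return []  # assumption: the text does not say what happens in this case

    # Peer check: each client evaluates the other clients' tokens
    # against its own policy and declines if any of them fails.
    accepted = []
    for name in deployment_map:
        peers = [tokens[p] for p in deployment_map if p != name]
        if all(by_name[name].verify(t) for t in peers):
            accepted.append(name)
    return accepted


required = {"gpu_attested": True}
server = Participant("server", {"gpu_attested": True}, required)
site_a = Participant("site-a", {"gpu_attested": True}, required)
# site-b has a stricter policy that site-a's token does not satisfy.
site_b = Participant("site-b", {"gpu_attested": True},
                     {"gpu_attested": True, "driver": "x"})
print(plan_deployment(server, [site_a, site_b], ["site-a", "site-b"]))
```

In this example site-b declines because site-a's token lacks the ``driver`` claim that site-b requires, so only site-a remains in the deployment. Whether the job then runs is left to the scheduler's resource and retry checks, which the sketch does not model.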