Privatar is the first framework that concurrently leverages both local and untrusted cloud compute to achieve privacy-preserving multi-user avatar reconstruction. The entire post-split flow is illustrated in the figure below.
- `multiface`: the baseline avatar reconstruction framework.
- `multiface_frequency_decompose`: decomposes the unwrapped texture into multiple frequency components. It adopts a BDCT filter with block size 4, which yields 16 frequency blocks and removes two convolution layers; nothing is offloaded. (See the BDCT sketch after this list.)
- `multiface_partition_frequency_decompose`: decomposes the unwrapped texture into multiple frequency components AND horizontally partitions the components between local and cloud. It uses the same block-size-4 BDCT filter (16 frequency blocks, two convolution layers removed). The number of offloaded frequency components is controlled via `num_freq_comp_offloaded`.
- `multiface_sparse`: adds sparsity to the decoder of the original VAE model only.
- `multiface_quantization`: changes the bit precision of the decoder's data to 8-/16-/32-bit integers.
- `multiface_direct_split`: directly splits the model architecture into private and public branches.
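To illustrate the BDCT decomposition used by the frequency-decompose variants, here is a minimal PyTorch sketch (ours, not the repo's implementation; function names and shapes are placeholders) that splits an unwrapped texture into 16 frequency components with a 4x4 blockwise DCT:

```python
import math
import torch

def dct_matrix(n: int = 4) -> torch.Tensor:
    # Orthonormal DCT-II basis matrix (n x n); rows are frequencies.
    k = torch.arange(n).float()
    basis = torch.cos(math.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
    basis[0] /= math.sqrt(2)
    return basis * math.sqrt(2 / n)

def bdct_decompose(texture: torch.Tensor, block: int = 4) -> torch.Tensor:
    # texture: (C, H, W) with H, W divisible by `block`.
    # Returns (block*block, C, H//block, W//block): one spatial map per
    # frequency component, i.e. 16 components for block=4.
    D = dct_matrix(block)
    C, H, W = texture.shape
    # Split into non-overlapping block x block tiles.
    tiles = texture.reshape(C, H // block, block, W // block, block)
    tiles = tiles.permute(0, 1, 3, 2, 4)          # (C, H/b, W/b, b, b)
    coeffs = D @ tiles @ D.T                      # 2-D DCT per tile
    # Gather coefficient (u, v) of every tile into component u*block+v.
    comps = coeffs.permute(3, 4, 0, 1, 2)
    return comps.reshape(block * block, C, H // block, W // block)

comps = bdct_decompose(torch.randn(3, 1024, 1024))
print(comps.shape)  # torch.Size([16, 3, 256, 256])
```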
We recommend using the NVIDIA-provided Docker image.
Command 1: Download the Docker image. If you have Docker 19.03 or later, a typical command to launch the container is:

```bash
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:xx.xx-py3
```

Note: our experiments are based on `nvcr.io/nvidia/pytorch:24.01-py3`.

Command 2: Download this repo:

```bash
git clone https://github.com/georgia-tech-synergy-lab/Privatar.git
```

Command 3: Launch the Docker container, linking the repo path to /work inside the container:

```bash
docker run --gpus all -v <path>:/work -it --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --memory 51200m --rm <docker_name>
```

where `<docker_name>` refers to the name of your downloaded image, e.g. `nvcr.io/nvidia/pytorch:xx.xx-py3`, and `<path>` refers to the path to Privatar.
CRITICAL! All following scripts assume that /work is the path to this repo inside the Docker container.
Inside the Docker container, install the required dependencies.
- Install OS-level dependencies:

```bash
apt-get install mesa-common-dev libegl1-mesa-dev libgles2-mesa-dev
apt-get install mesa-utils
glxinfo | grep -i opengl
```

- Install Python packages:
```bash
pip3 install torch
pip3 install Pillow ninja imageio imageio_ffmpeg six tensorboard opencv-python==4.8.0.74 wandb torchjpeg
pip3 install -U opencv-python
```

Note:

- We use wandb to track training/testing progress and record the final results. By default, wandb is turned off; you can change `wandb_enable` in each training/testing script to enable it.
- If the GPU is an RTX 5090, use `pip install -U --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu130` instead of `pip3 install torch`.
- Install the nvdiffrast package:

```bash
git clone https://github.com/NVlabs/nvdiffrast
cd nvdiffrast
python3 setup.py install
```

Note: if you are using an RTX 5090, please run the following patch after installing nvdiffrast to enable the feature:

```bash
source /work/experiment_scripts/nvdiffrast_patch.sh
```

Download the dataset:

```bash
cd /work
mkdir dataset
python3 ./multiface/download_dataset.py --dest "/work/dataset" --download_config "./mini_download_config.json"
```

If you follow the above instructions, the dataset will be downloaded under /work/dataset.
The pretrained weights for different users in the provided datasets are collected at facial_pretrained_datasets. We use the 6795937 base model as the evaluation target; other models would work as well. The pretrained model weights for 6795937 are available at 6795937_base.
```bash
cd /work
mkdir pretrain_model
cd pretrain_model
wget https://fb-baas-f32eacb9-8abb-11eb-b2b8-4857dd089e15.s3.amazonaws.com/MugsyDataRelease/PretrainedModel/6795937--GHS-base_nosl/best_model.pth -O 6795937_best_model.pth
```

- Original: multiface baseline using the pretrained model weights

```bash
cd /work
git clone https://github.com/facebookresearch/multiface.git
cd multiface
python3 launch_train_job_serial.py
```

- Design Choice 1: directly split mesh and unwrapped texture into two separate paths (offload the entire unwrapped texture)
```bash
cd /work/multiface_direct_split
python3 launch_train_job_serial.py
```

- Design Choice 2: quantize the model to low precision

```bash
cd /work/multiface_quantization
python3 launch_train_job_serial.py
```

- Design Choice 3: prune channels from the decoder to reduce local computation

```bash
cd /work/multiface_sparse
python3 launch_train_job_serial.py
```

- Design Choice 4: decompose the unwrapped texture into 16 frequency components, but keep all of them running locally. This requires training to be completed.

```bash
cd /work/multiface_frequency_decompose
python3 launch_train_job_serial.py
```

- Design Choice 5: decompose the unwrapped texture into 16 frequency components with configurable component offloading. The number of offloaded frequency components is controlled by `num_freq_comp_offloaded`. This requires training to be completed.

```bash
cd /work/multiface_partition_frequency_decompose
python3 launch_train_job_serial.py
```

Note: all training results are located in the /work/training_results folder.
Note that these tests directly iterate over the whole test dataset in random order.
- Original: multiface baseline using the pretrained model weights

```bash
cd /work/multiface
python3 launch_test_job_serial.py
```

- Design Choice 1: directly split mesh and unwrapped texture into two separate paths (offload the entire unwrapped texture)

```bash
cd /work/multiface_direct_split
python3 launch_test_job_serial.py
```

- Design Choice 2: quantize the model to low precision

```bash
cd /work/multiface_quantization
python3 launch_test_job_serial.py
```

- Design Choice 3: prune channels from the decoder to reduce local computation

```bash
cd /work/multiface_sparse
python3 launch_test_job_serial.py
```

- Design Choice 4: decompose the unwrapped texture into 16 frequency components, but keep all of them running locally. This requires training to be completed.

```bash
cd /work/multiface_frequency_decompose
python3 launch_test_job_serial.py
```

- Design Choice 5: decompose the unwrapped texture into 16 frequency components with configurable component offloading. The number of offloaded frequency components is controlled by `num_freq_comp_offloaded`. This requires training to be completed.

```bash
cd /work/multiface_partition_frequency_decompose
python3 launch_test_job_serial.py
```

Note: all testing results are located in the /work/testing_results folder.
We also provide the following scripts in case you only want to see how the model performs on a specific set of images. Specifically, put the dataset paths of the original images into the file /work/experiment_scripts/render_scripts/test_image_path, then run the following commands.

The input images can be generated by running the following script:

```bash
python3 /work/experiment_scripts/render_scripts/render_test_expression.py
```

The ground truth for the input images will be written to /work/render_results/ground_truth_input_testing_data.
- Original: multiface baseline using the pretrained model weights

```bash
cd /work/multiface
python3 launch_test_selected_expressions.py
```

- Design Choice 1: directly split mesh and unwrapped texture into two separate paths (offload the entire unwrapped texture)

```bash
cd /work/multiface_direct_split
python3 launch_test_selected_expressions.py
```

- Design Choice 2: quantize the model to low precision

```bash
cd /work/multiface_quantization
python3 launch_test_selected_expressions.py
```

- Design Choice 3: prune channels from the decoder to reduce local computation

```bash
cd /work/multiface_sparse
python3 launch_test_selected_expressions.py
```

- Design Choice 4: decompose the unwrapped texture into 16 frequency components, but keep all of them running locally. This requires training to be completed.

```bash
cd /work/multiface_frequency_decompose
python3 launch_test_selected_expressions.py
```

- Design Choice 5: decompose the unwrapped texture into 16 frequency components with configurable component offloading. The number of offloaded frequency components is controlled by `num_freq_comp_offloaded`. This requires training to be completed.

```bash
cd /work/multiface_partition_frequency_decompose
python3 launch_test_selected_expressions.py
```

Note that we use /work/dataset/m--20180227--0000--6795937--GHS/images/E009_Smile_Mouth_Open/400023/002799.png (representing a high-dynamic expression) and /work/dataset/m--20180227--0000--6795937--GHS/images/E001_Neutral_Eyes_Open/400009/000102.png (representing a low-dynamic expression) to produce Figure 1 in the paper.
We use torch.jit.trace to optimize the kernels offloaded to the GPU and pick whichever of the traced and untraced versions gives better performance, as in the sketch below.
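A minimal sketch of this trace-and-compare approach (illustrative only; `decoder` and the input shape are placeholders, not the repo's model):

```python
import torch

def time_fn(fn, x, iters=100):
    # Average GPU latency in milliseconds, after a short warm-up.
    for _ in range(10):
        fn(x)
    torch.cuda.synchronize()
    start = torch.cuda.Event(enable_timing=True)
    end = torch.cuda.Event(enable_timing=True)
    start.record()
    for _ in range(iters):
        fn(x)
    end.record()
    torch.cuda.synchronize()
    return start.elapsed_time(end) / iters

# Placeholder model and input; substitute the decoder under test.
decoder = torch.nn.Sequential(torch.nn.Linear(256, 256), torch.nn.ReLU()).cuda().eval()
x = torch.randn(1, 256, device="cuda")

with torch.no_grad():
    traced = torch.jit.trace(decoder, x)
    t_eager, t_traced = time_fn(decoder, x), time_fn(traced, x)

# Keep whichever version is faster.
best = traced if t_traced < t_eager else decoder
print(f"eager {t_eager:.3f} ms, traced {t_traced:.3f} ms")
```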
- Original: multiface baseline using the pretrained model weights

```bash
cd /work/multiface
python3 latency_profiling_script.py
```

- Design Choice 2: quantize the model to low precision

```bash
cd /work/multiface_quantization
python3 latency_profiling_script.py
```

Note: change `bitwidth=<val>` in `model = decoder_linear_quantization(model, bitwidth=8, datatype=torch.int8)` to change the precision.
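`decoder_linear_quantization` is the repo's own helper; as a rough sketch of the underlying idea (ours, not the repo's code), symmetric linear quantization of a weight tensor looks like:

```python
import torch

def linear_quantize(w: torch.Tensor, bitwidth: int = 8):
    # Symmetric linear quantization: map [-max|w|, max|w|] onto the
    # signed integer grid and keep the scale for dequantization.
    qmax = 2 ** (bitwidth - 1) - 1            # 127 for 8-bit
    scale = w.abs().max() / qmax
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax).to(torch.int8)
    return q, scale

w = torch.randn(4, 4)
q, scale = linear_quantize(w, bitwidth=8)
w_hat = q.float() * scale                     # dequantized approximation
print((w - w_hat).abs().max())                # worst-case quantization error
```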
- Design Choice 3: prune channels from the decoder to reduce local computation

```bash
cd /work/multiface_sparse
python3 latency_profiling_script.py
```

- Design Choice 4: decompose the unwrapped texture into 16 frequency components, but keep all of them running locally. This requires training to be completed.

```bash
cd /work/multiface_frequency_decompose
python3 latency_profiling_script.py
```

- Design Choice 5: decompose the unwrapped texture into 16 frequency components with configurable component offloading. The number of offloaded frequency components is controlled by `num_freq_comp_offloaded`. This requires training to be completed.

```bash
cd /work/multiface_partition_frequency_decompose
python3 latency_profiling_script_local_path.py
python3 latency_profiling_script_offload_path.py
```

latency_profiling_script_local_path.py measures the latency of the local path, which runs on the VR headset. latency_profiling_script_offload_path.py measures the latency of the offloaded path, which runs on untrusted devices (here, a PC with a GPU).

To further understand the total amount of computation, we also offer a script that computes the FLOPs needed for both paths under different configurations:

```bash
python3 latency_flops_calculation.py
```

Note: multiface_direct_split is a design choice intended for analysis, hence we do not provide a latency profiling script for it.
Before calculating noise, the following two steps need to be completed.

- Complete training: train the model to generate the final weights required for testing.
- Perform testing: use the trained model weights for evaluation. During testing, the latent codes are stored under ./testing_results/<project_name>/latent_code/: z_<id>.pth is the latent code for the local path, and z_offload_<id>.pth is the latent code for the offloaded path (which runs on the cloud). Note: the `save_latent_code` flag in the script launch_test_job_serial.py controls whether latent codes are explicitly stored to the external drive.
Differential Privacy (DP) noise calculation follows the paper. DP requires prior knowledge of the L2 norm of all offloaded latent codes (see the sketch below); detailed procedures are documented in the comments in the code.
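For instance, the maximum L2 norm over the saved offloaded latent codes could be gathered along these lines (a sketch of ours; the glob pattern assumes the layout from the testing step above):

```python
import glob
import torch

max_l2 = 0.0
for path in glob.glob("./testing_results/*/latent_code/z_offload_*.pth"):
    z = torch.load(path, map_location="cpu")
    # Track the largest L2 norm seen across all offloaded latent codes.
    max_l2 = max(max_l2, z.flatten().norm(p=2).item())
print(f"max L2 norm of offloaded latent codes: {max_l2:.4f}")
```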
```bash
cd /work/experiment_scripts/dp_analysis
python3 dp_noise_generation_for_multiface.py
```

dp_noise_generation_for_multiface.py calculates the amount of DP-based noise needed to protect information under complete offloading, i.e., when the entire decoder of the original multiface is offloaded.
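As a rough illustration of the calibration principle (ours, not the script's exact derivation, which follows the paper): adding isotropic Gaussian noise N(0, sigma^2 I) to a latent code whose L2 norm is at most B keeps the mutual information at most B^2 / (2 sigma^2), so a bound of beta nats can be met with sigma = B / sqrt(2 beta):

```python
import math

def isotropic_sigma(max_l2: float, mi_bound: float) -> float:
    # KL(N(z, s^2 I) || N(0, s^2 I)) = ||z||^2 / (2 s^2) upper-bounds the
    # mutual information, so s = B / sqrt(2 * beta) suffices for bound beta.
    return max_l2 / math.sqrt(2.0 * mi_bound)

for beta in [4, 3, 1, 0.1, 0.01]:               # MI bounds used by the scripts
    print(beta, isotropic_sigma(10.0, beta))    # B = 10.0 is a placeholder
```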
```bash
cd /work/experiment_scripts/dp_analysis
python3 dp_noise_generation_for_partition_multiface.py
```

The dp_noise_generation_for_partition_multiface.py script covers the two partitioned design choices:

- multiface_direct_split, which directly offloads the entire unwrapped texture. This is the default choice, as it is the baseline (complete offload + DP noise).
- multiface_partition_frequency_decompose, which offloads selected high-frequency components of the unwrapped texture.
We pre-select posterior success rates of [0.98, 0.827, 0.4, 0.09, 0.035], meaning there may exist an attacker who can mount an attack that succeeds with probability [98%, 82.7%, 40%, 9%, 3.5%]. These correspond to the mutual information bounds [4, 3, 1, 0.1, 0.01].

After running this script, the generated noise will appear at:

```
/work/experiment_scripts/dp_analysis/generated_dp_noise/dp_noise_partition_<num_offloaded_freq_components>_<mutual_information_bound>.npy
```
Noise calculation following PAC Privacy requires prior knowledge of the covariance of all offloaded latent codes, so that PAC Privacy can leverage the per-dimension differences to generate non-uniform noise that minimizes the overall noise intensity.
```bash
python3 /work/experiment_scripts/pac_analysis/pac_noise_generation_for_partition_multiface.py
```

After running this script, the generated noise will appear at:

```
/work/experiment_scripts/pac_analysis/noise_covariance/pac_noise_partition_<num_offloaded_freq_components>_<mutual_information_bound>.npy
```
For PAC noise generation, we directly use the same mutual information bounds as for DP, which yields the same provable privacy guarantee as the DP noise generation. The sketch below illustrates the anisotropic allocation idea.
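A simplified sketch of the anisotropic allocation, under our own assumptions (names and shapes are ours; see the paper for the actual derivation): along each eigenvector of the latent-code covariance, assign noise variance proportional to the square root of the eigenvalue, scaled so the Gaussian mutual information bound (1/2) * sum(lambda_i / sigma_i^2) equals the target beta.

```python
import torch

def pac_noise_covariance(latents: torch.Tensor, mi_bound: float) -> torch.Tensor:
    # latents: (N, d) offloaded latent codes; returns a (d, d) noise covariance.
    cov = torch.cov(latents.T)                 # data covariance (d, d)
    evals, evecs = torch.linalg.eigh(cov)
    evals = evals.clamp(min=0)
    # sigma_i^2 = sqrt(l_i) * sum_j sqrt(l_j) / (2 beta) makes
    # (1/2) * sum_i l_i / sigma_i^2 = beta.
    sqrt_l = evals.sqrt()
    sigma2 = sqrt_l * sqrt_l.sum() / (2.0 * mi_bound)
    return evecs @ torch.diag(sigma2) @ evecs.T

latents = torch.randn(1000, 64)                # placeholder latent codes
noise_cov = pac_noise_covariance(latents, mi_bound=1.0)
```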
After noise calculation, we can now test the actual accuracy and latency of noisy inference. Specifically, in the horizontally partitioned avatar reconstruction flow, the generated noise is injected only into the offloaded latent codes.
Noisy inference
```bash
cd /work/multiface_partition_frequency_decompose
python3 launch_noisy_test_job_serial.py
```

In this script, the noise generated in Step 5 is fed to the model as `gaussian_noise_covariance_path`; the injection step is sketched below. This produces the detailed accuracy of noisy horizontally partitioned avatar reconstruction under various partition configurations. Beyond the proposed horizontal frequency-based partitioned design choice, we also offer noisy inference code for the direct split design.
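Conceptually, the injection amounts to sampling from the generated noise distribution and perturbing only the offloaded latent code. A sketch of ours, assuming the saved .npy holds a noise covariance matrix (the file name and shapes are placeholders):

```python
import numpy as np
import torch

# `noise_path` is a placeholder following the naming pattern from Step 5.
noise_path = "/work/experiment_scripts/dp_analysis/generated_dp_noise/dp_noise_partition_8_1.npy"
noise_cov = torch.from_numpy(np.load(noise_path)).float()
# Small jitter keeps the Cholesky factorization stable if the matrix is only PSD.
L = torch.linalg.cholesky(noise_cov + 1e-6 * torch.eye(noise_cov.shape[0]))
z_offload = torch.randn(noise_cov.shape[0])    # placeholder offloaded latent code
z_noisy = z_offload + L @ torch.randn(noise_cov.shape[0])
# Only the offloaded code is perturbed; the local-path latent stays clean.
```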
The attacker guesses the expression with the minimal difference by comparing the estimated frequency components to those of reference expressions, as shown in Fig. 14 of the paper. Therefore, we first need to define the frequency components of the reference expressions.
We provide three different ways of generating reference components; under all of them the attacker achieves a similar empirical success rate. The matching step itself is sketched after the list below.
- Method 1: `accumulate_channel` mode, where only high-frequency components are used as reference.
- Method 2: `attack_from_high_frequency_channel` mode, where only high-frequency components are used as reference.
- Method 3: when both modes are set to False, all 16 frequency components are sorted by ambiguity, and frequency components with similar amounts of ambiguity are merged into one: `if model.normalize_list[freq_pair[0]] + model.normalize_list[freq_pair[1]] < 2:`, where 2 is an arbitrary choice that can be changed to obtain different merging strategies.
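The matching step reduces to a nearest-neighbor search over the reference components. A simplified sketch (shapes, counts, and names are ours, not the repo's):

```python
import torch

def guess_expression(pred_comps: torch.Tensor, refs: torch.Tensor) -> int:
    # pred_comps: predicted (high-)frequency components, flattened to (d,).
    # refs: (num_expressions, d) precomputed reference components.
    dists = (refs - pred_comps[None, :]).norm(dim=1)  # L2 distance per expression
    return int(dists.argmin())                        # expression with minimal difference

refs = torch.randn(65, 4096)         # placeholder: one reference per expression
pred = refs[12] + 0.1 * torch.randn(4096)
print(guess_expression(pred, refs))  # ideally recovers expression 12
```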
Specify the configuration of `accumulate_channel` and `attack_from_high_frequency_channel`, then run the following script to launch the attack:

```bash
python3 launch_empirical_attack.py
```

Once the reference frequency components are set, this runs an empirical identification attack against a pretrained DeepAppearanceVAE_Partition model by matching predicted high-frequency texture components to the precomputed components of each expression, and reports the identification accuracy. The final accuracy is printed as `attack_accuracy_mean <final_PSR>`.

Note: `using_pac_noise` controls whether PAC-Privacy-based or Differential-Privacy-based noise is used.
We also train a three-layer fully connected network that estimates the expression from the offloaded noisy latent code; a sketch of such an attacker is shown below. Run the following script to start training.
The NN attacker randomly takes one sample from each expression, as detailed in /work/experiment_scripts/empirical_attack/selected_expression_frame_list.txt; these samples constitute its training dataset.
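For reference, a three-layer fully connected attacker of this kind might look like the sketch below (the dimensions are our own placeholders; the actual architecture lives in launch_train_nn_attacker.py):

```python
import torch
import torch.nn as nn

class NNAttacker(nn.Module):
    # Three fully connected layers mapping a noisy offloaded latent code
    # to expression logits. latent_dim and num_expressions are placeholders.
    def __init__(self, latent_dim: int = 256, num_expressions: int = 65):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, num_expressions),
        )

    def forward(self, z_noisy: torch.Tensor) -> torch.Tensor:
        return self.net(z_noisy)

attacker = NNAttacker()
logits = attacker(torch.randn(8, 256))   # batch of noisy latent codes
pred_expression = logits.argmax(dim=1)   # attacker's expression guesses
```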
```bash
cd /work/multiface_partition_frequency_decompose
python3 launch_train_nn_attacker.py
```

After training, launch the attack via:

```bash
cd /work/multiface_partition_frequency_decompose
python3 launch_test_nn_attacker.py
```

We also offer a few scripts for researchers who are interested in exploring the model more deeply, to inspect different statistics and configurations.

/work/experiment_scripts/bdct_reconstruction/bdct_4x4_reconstruction_dataloader.ipynb contains a script that decomposes a given unwrapped texture into different frequency components and lists a few different decomposition configurations.
For each configuration, all frequency components are written to the folder /work/experiment_scripts/bdct_reconstruction for visualization.
To help understand the covariance of different frequency components across different datasets, we provide a script that profiles all decomposed frequency components on designated datasets; the quantity it reports is sketched below.
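The reported quantity is the trace of the covariance of each frequency component, which equals the sum of per-dimension variances. A sketch of ours (the sample counts and dimensions are placeholders):

```python
import torch

def trace_of_covariance(samples: torch.Tensor) -> float:
    # samples: (N, d) flattened values of one frequency component.
    # trace(cov) = sum of the per-dimension variances.
    return samples.var(dim=0, unbiased=True).sum().item()

# Example over the 16 components from the BDCT decomposition above.
for i in range(16):
    comp = torch.randn(100, 4096)        # placeholder component samples
    print(f"trace of covariance = {trace_of_covariance(comp)} for freq component = {i}")
```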
To run it:

```bash
cd /work/multiface_frequency_decompose
python3 launch_l2norm_freq_cov_analysis.py
```

It will show results like the following:

```
trace of covariance = 11308.309006199575 for freq component = 0
trace of covariance = 199.41605765586263 for freq component = 1
trace of covariance = 77.53071010274161 for freq component = 2
trace of covariance = 41.89610293635083 for freq component = 3
trace of covariance = 152.33769320529836 for freq component = 4
trace of covariance = 33.72750777014488 for freq component = 5
trace of covariance = 26.97710811919577 for freq component = 6
trace of covariance = 19.355714181792997 for freq component = 7
trace of covariance = 38.930325425557726 for freq component = 8
trace of covariance = 25.547661178709628 for freq component = 9
trace of covariance = 23.260128349715654 for freq component = 10
trace of covariance = 18.580644217635967 for freq component = 11
trace of covariance = 23.134909910006407 for freq component = 12
trace of covariance = 16.6648382626215 for freq component = 13
trace of covariance = 16.267422970259233 for freq component = 14
trace of covariance = 12.812623106426202 for freq component = 15
```

Note: custom_scripts contains scripts for development, maintenance, and verification purposes.
To help create visualized expressions rendered from a given model configuration, we also offer a script under each setup to render visual avatar predictions for specified input images.
Specifically, the set of images the model renders is listed in /work/experiment_scripts/render_scripts/test_image_path.

To launch the image rendering:

```bash
cd /work/<path_to_configuration>/
python3 launch_test_all_expressions_RTX3090.py
```

where <path_to_configuration> can be multiface, multiface_frequency_decompose, multiface_quantization, multiface_direct_split, multiface_partition_frequency_decompose, or multiface_sparse.

The results are written to the folder /work/render_results/<configuration_name>.
Have Fun! Enjoy! :D
