XingweiDeng/PMCC


Prompt-affinity-Multi-modal-Class-Centroids-for-Unsupervised-Domain-Adaption


Highlights

Architecture

Abstract: In recent years, advances in large vision-language models (VLMs) such as CLIP have sparked renewed interest in leveraging the prompt learning mechanism to preserve semantic consistency between source and target domains in unsupervised domain adaption (UDA). While these approaches show promising results, they encounter fundamental limitations when quantifying the similarity between source and target domain data, primarily stemming from redundant and modality-missing class centroids. To address these limitations, we propose Prompt-affinity Multi-modal Class Centroids for UDA (termed PMCC). First, we fuse the text class centroids (directly generated from the text encoder of CLIP with manual prompts for each class) and image class centroids (generated from the image encoder of CLIP for each class based on source domain images) to yield the multi-modal class centroids. Second, we conduct the cross-attention operation between each source or target domain image and these multi-modal class centroids. In this way, these class centroids, which contain rich semantic information about each class, serve as a bridge to effectively measure the semantic similarity between different domains. Finally, we design a logit bias head and employ a multi-modal prompt learning mechanism to accurately predict the true class of each image for both source and target domains. We conduct extensive experiments on 4 popular UDA datasets: Office-31, Office-Home, VisDA-2017, and DomainNet. The experimental results show that PMCC achieves higher performance with lower model complexity than the state-of-the-art (SOTA) UDA methods. The code of this project is available at GitHub: https://github.com/246dxw/PMCC.
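
The first two steps of the abstract (fusing text and image class centroids, then cross-attending from an image to the fused centroids) can be illustrated with a dependency-free toy sketch. It is not the repository's implementation. The fusion operator (a plain average followed by re-normalization), the absence of learned projections in the attention, and all function names and toy vectors are assumptions made for illustration only; real features would come from the CLIP text and image encoders.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def image_class_centroids(features, labels, num_classes):
    """Average the normalized source-domain features of each class."""
    dim = len(features[0])
    sums = [[0.0] * dim for _ in range(num_classes)]
    counts = [0] * num_classes
    for feat, label in zip(features, labels):
        for j, x in enumerate(l2_normalize(feat)):
            sums[label][j] += x
        counts[label] += 1
    return [l2_normalize([s / max(c, 1) for s in row])
            for row, c in zip(sums, counts)]

def fuse_centroids(text_centroids, image_centroids):
    """ASSUMPTION: fuse by averaging the two modalities, then re-normalize."""
    return [l2_normalize([(t + v) / 2 for t, v in zip(tc, ic)])
            for tc, ic in zip(text_centroids, image_centroids)]

def cross_attention(query, centroids):
    """Scaled dot-product attention with the image as query and the
    multi-modal centroids as keys and values (no learned projections)."""
    dim = len(query)
    scores = [sum(q * k for q, k in zip(query, c)) / math.sqrt(dim)
              for c in centroids]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    attended = [sum(w * c[j] for w, c in zip(weights, centroids))
                for j in range(dim)]
    return weights, attended

# Toy stand-ins for CLIP features: 2 classes, 3-dimensional embeddings.
text_centroids = [l2_normalize([1.0, 0.0, 0.0]), l2_normalize([0.0, 1.0, 0.0])]
source_features = [[0.9, 0.1, 0.0], [1.0, 0.0, 0.1],
                   [0.1, 0.8, 0.0], [0.0, 1.0, 0.2]]
source_labels = [0, 0, 1, 1]

image_centroids = image_class_centroids(source_features, source_labels, 2)
multimodal_centroids = fuse_centroids(text_centroids, image_centroids)

# A target-domain image that resembles class 1.
target_feature = l2_normalize([0.2, 0.9, 0.1])
weights, attended = cross_attention(target_feature, multimodal_centroids)
print("attention weights over classes:", weights)
```

Because source and target images attend to the same set of centroids, their attended representations live in a shared, class-anchored space, which is the "bridge" role the abstract describes.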

Main Contributions

  • New perspective: To the best of our knowledge, this is the first attempt to leverage both the visual and textual semantic information of each class from VLMs to preserve the semantic consistency between source and target domains in UDA.
  • Novel method: We introduce a novel UDA method, PMCC, which constructs multi-modal class centroids that serve as a semantic bridge for measuring the similarity between data across domains, combined with a multi-modal prompt learning mechanism.
  • High performance: Extensive experiments on 4 popular UDA datasets (Office-31, Office-Home, VisDA-2017, and DomainNet) demonstrate that PMCC achieves higher performance with lower model complexity than the state-of-the-art UDA methods. On Office-Home, for example, PMCC yields higher average accuracy while its learnable parameters, training time, and inference time are only 8.8%, 73.6%, and 54.5% of those of the strongest baseline, respectively.

Results

PMCC in comparison with existing prompt tuning methods

The results below report accuracy on 3 UDA datasets with a ViT-B/16 backbone, followed by per-target-domain accuracy on DomainNet. Our PMCC method adopts the paradigm of multi-modal prompt tuning.

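| Name | Office-Home Acc. | Office-31 Acc. | VisDA-2017 Acc. |
|---|---|---|---|
| CLIP | 82.1 | 77.5 | 88.9 |
| CoOp | 83.9 | 89.4 | 82.7 |
| CoCoOp | 84.1 | 88.9 | 84.2 |
| VPT-deep | 83.9 | 89.4 | 86.2 |
| MaPLe | 84.2 | 89.6 | 83.5 |
| DAPL | 84.4 | 81.2 | 89.5 |
| PDA | 85.7 | 91.2 | 89.7 |
| PMCC (Ours) | 86.0 | 91.8 | 90.1 |

| Method (DomainNet Acc.) | -> c | -> i | -> p | -> q | -> r | -> s | Avg |
|---|---|---|---|---|---|---|---|
| ViT | 43.6 | 42.4 | 39.1 | 19.5 | 39.9 | 44.4 | 38.2 |
| CLIP | 70.1 | 46.4 | 61.7 | 13.7 | 82.9 | 62.6 | 56.2 |
| DAMP (RN101) | 69.7 | 51.0 | 67.5 | 14.7 | 82.5 | 61.5 | 57.8 |
| PDA | 74.3 | 49.6 | 69.9 | 14.9 | 84.4 | 66.0 | 59.9 |
| DAPrompt | 73.4 | 50.3 | 69.8 | 14.7 | 84.8 | 65.6 | 59.8 |
| PMCC | 74.7 | 49.8 | 70.1 | 15.4 | 84.2 | 66.5 | 60.1 |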
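
As a quick sanity check on the DomainNet numbers, the sketch below recomputes the Avg column. It assumes Avg is the unweighted mean of the six per-target-domain columns (c, i, p, q, r, s); the README does not state this, and the published averages may have been computed from unrounded accuracies, so small gaps are expected.

```python
# Per-target-domain accuracies and the reported Avg, copied from the table.
rows = {
    "ViT": ([43.6, 42.4, 39.1, 19.5, 39.9, 44.4], 38.2),
    "CLIP": ([70.1, 46.4, 61.7, 13.7, 82.9, 62.6], 56.2),
    "DAMP (RN101)": ([69.7, 51.0, 67.5, 14.7, 82.5, 61.5], 57.8),
    "PDA": ([74.3, 49.6, 69.9, 14.9, 84.4, 66.0], 59.9),
    "DAPrompt": ([73.4, 50.3, 69.8, 14.7, 84.8, 65.6], 59.8),
    "PMCC": ([74.7, 49.8, 70.1, 15.4, 84.2, 66.5], 60.1),
}

def mean_accuracy(values):
    """Unweighted mean over target domains (ASSUMED definition of Avg)."""
    return sum(values) / len(values)

for name, (per_domain, reported) in rows.items():
    recomputed = mean_accuracy(per_domain)
    print(f"{name:13s} recomputed={recomputed:.2f} reported={reported}")

# Largest absolute gap between the recomputed and the reported averages.
max_gap = max(abs(mean_accuracy(values) - reported)
              for values, reported in rows.values())
print(f"largest gap: {max_gap:.3f}")
```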

Installation

This codebase is tested on Ubuntu 18.04 LTS with Python 3.7. Follow the steps below to create the environment and install the dependencies.

  • Set up the conda environment.
# Create a conda environment
conda create -y -n pmcc python=3.7

# Activate the environment
conda activate pmcc

# Install torch (requires version >= 1.8.1) and torchvision
# Please refer to https://pytorch.org/get-started/previous-versions/ if your CUDA version is different
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.3 -c pytorch
  • Install the Dassl library.
# Instructions borrowed from https://github.com/KaiyangZhou/Dassl.pytorch#installation

# Clone this repo
git clone https://github.com/KaiyangZhou/Dassl.pytorch.git
cd Dassl.pytorch

# Install dependencies
pip install -r requirements.txt

# Install this library (no need to re-build if the source code is modified)
python setup.py develop
cd ..
  • Clone the PMCC code repository and install its requirements.
# Clone PMCC code base
git clone https://github.com/246dxw/PMCC.git
cd PMCC

# Install requirements
pip install -r requirements.txt
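
After installation, it can help to confirm that the installed torch satisfies the "version >= 1.8.1" requirement noted above. The sketch below is one way to do that; `parse_version` and `meets_minimum` are hypothetical helper names, not part of this repository. Versions are compared as integer tuples because a plain string comparison would wrongly rank "1.12.0" below "1.8.1".

```python
import re

def parse_version(version):
    """Return the leading numeric parts, e.g. '1.12.0+cu113' -> (1, 12, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part or 0) for part in match.groups())

def meets_minimum(version, minimum=(1, 8, 1)):
    """True if `version` is at least `minimum` (default: torch 1.8.1)."""
    return parse_version(version) >= minimum

try:
    import torch
    status = "ok" if meets_minimum(torch.__version__) else "too old"
    print("torch", torch.__version__, status)
except ImportError:
    print("torch is not installed in this environment")
```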

Data preparation

Prepare all datasets before training. Datasets list:

  • Office-31
  • Office-Home
  • VisDA-2017
  • DomainNet


Training and Evaluation

Follow the instructions below to train, evaluate, and reproduce the results. First, change the data directory to match the location of the datasets on your machine.

Training

# Example: train on the Office-Home dataset with Art as the source domain and Clipart as the target domain (a-c)
bash scripts/pmcc/main_pmcc.sh officehome b32_ep10_officehome PMCC ViT-B/16 2 a-c 0

Evaluation

# Example: evaluate on the Office-Home dataset with Art as the source domain and Clipart as the target domain (a-c)
bash scripts/pmcc/eval_pmcc.sh officehome b32_ep10_officehome PMCC ViT-B/16 2 a-c 0

Details for each method are in the corresponding folder under the scripts folder (PMCC/scripts on the main branch of 246dxw/PMCC on GitHub).
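
To cover every source-target pair of a dataset rather than a single task, the training command can be generated in a loop. The sketch below only builds and prints the command strings; it does not run them. The abbreviations a (Art) and c (Clipart) come from the example above, while p and r for the remaining Office-Home domains are assumptions that should be checked against the scripts. The README does not document the fifth and last positional arguments, so they are passed through unchanged under the placeholder names `arg5` and `arg7`.

```python
from itertools import permutations

# 'a' and 'c' appear in the README example; 'p' and 'r' are ASSUMED codes.
DOMAINS = ["a", "c", "p", "r"]

def build_command(script, source, target,
                  dataset="officehome", config="b32_ep10_officehome",
                  trainer="PMCC", backbone="ViT-B/16",
                  arg5="2", arg7="0"):
    """Assemble one command line in the positional order used by the README.

    arg5 and arg7 are undocumented in the README; the defaults are the
    values from its example command.
    """
    return (f"bash {script} {dataset} {config} {trainer} {backbone} "
            f"{arg5} {source}-{target} {arg7}")

# One command per ordered (source, target) pair with source != target.
train_commands = [build_command("scripts/pmcc/main_pmcc.sh", source, target)
                  for source, target in permutations(DOMAINS, 2)]

for command in train_commands:
    print(command)
```

Swapping the script path for scripts/pmcc/eval_pmcc.sh yields the matching evaluation commands.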

Acknowledgements

The style of this README follows PDA. Our code builds on the CoOp and CoCoOp, DAPL, MaPLe, and PDA repositories, among others. We thank the authors for releasing their code. If you use their models and code, please consider citing these works as well. The supported methods are as follows:

| Method | Paper | Code |
|---|---|---|
| CoOp | IJCV 2022 | link |
| CoCoOp | CVPR 2022 | link |
| VPT | ECCV 2022 | link |
| IVLP & MaPLe | CVPR 2023 | link |
| DAPL | TNNLS 2023 | link |
| PDA | AAAI 2024 | link |
