Code will be available soon.
Weakly supervised visual grounding aims to locate the image region described by an input query sentence, without access to the mapping between image regions and queries during training. Current methods treat spatial grounding as an object-retrieval task, relying on cross-modal similarity scores for proposal selection; however, they fail to address the model overfitting caused by unreliable cross-modal similarity scores. To overcome this, we first propose the Confidence-aware Pseudo-label Learning (CPL) framework, which generates diverse pseudo queries for region proposals and then establishes reliable associations for model training based on uni-modal similarity scores. Second, we propose a cross-modal verification module, built on a pretrained vision-language model, to verify these associations. However, the verification module is isolated from the grounding model, so it can only assess associations statically and cannot correct suspicious ones. Finally, we introduce CPL++, which makes two improvements. First, it upgrades the verification process to use the model's grounding loss, identifying suspicious associations dynamically and selectively leveraging them during training. Second, it adds a self-supervised association correction module that rectifies suspicious associations, thereby mitigating the risk of error propagation. Experimental results on five datasets demonstrate the superiority of our approach.
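As a rough illustration of the two core ideas above — associating each query with a proposal via uni-modal similarity, and dynamically flagging suspicious associations by grounding loss — the following is a minimal NumPy sketch. All names, shapes, and the fixed loss threshold are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def select_associations(uni_modal_sim, losses, loss_threshold=1.0):
    """Illustrative sketch of confidence-aware association selection.

    uni_modal_sim : (num_queries, num_proposals) array of similarities
        between each training query and the pseudo query generated for
        each region proposal (hypothetical precomputed scores).
    losses : (num_queries,) current grounding-loss value per query,
        used as a dynamic reliability signal.
    Returns the selected proposal index per query and a boolean mask
    marking which associations are treated as reliable.
    """
    # Associate each query with the proposal whose pseudo query is most
    # similar in the uni-modal (text-to-text) space.
    selected = uni_modal_sim.argmax(axis=1)
    # Associations whose grounding loss exceeds the threshold are
    # flagged as suspicious; these would be down-weighted or passed to
    # a correction module rather than trusted directly.
    reliable = losses <= loss_threshold
    return selected, reliable

sim = np.array([[0.9, 0.2, 0.1],
                [0.3, 0.8, 0.4]])
losses = np.array([0.5, 2.0])
sel, rel = select_associations(sim, losses)
# sel → [0, 1]; rel → [True, False]
```

The split into "reliable" and "suspicious" associations mirrors CPL++'s selective use of training pairs; the actual system derives both signals inside the grounding model rather than from fixed arrays.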
