
H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos

Hai Ci, Xiaokang Liu, Pei Yang, Yiren Song, Mike Zheng Shou*
Show Lab, National University of Singapore
*Corresponding author

📄 Paper (arXiv): coming soon
🌐 Project Page: https://showlab.github.io/H2R-Grounder/


⚡ TL;DR

H2R-Grounder converts third-person human interaction videos into frame-aligned robot manipulation videos, using no paired human–robot data for training.


📷 Method Overview

[Figure: H2R-Grounder pipeline]

Figure: H2R-Grounder pipeline. We extract pose and background to form H2Rep, then use a diffusion-based in-context model to generate physically grounded robot videos aligned with human actions.
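
Since the code is not yet released, the following is only a minimal sketch of how the pipeline in the figure could be wired together. Every name in it (H2Rep, extract_pose, extract_background, build_h2rep, generate_robot_video, and all array shapes) is a hypothetical placeholder standing in for the actual H2R-Grounder implementation, with dummy stubs where the real pose estimator, background extractor, and diffusion-based in-context generator would go.

```python
# Hypothetical sketch of the H2R-Grounder pipeline from the figure caption.
# All names and shapes are illustrative placeholders; the released code may differ.
from dataclasses import dataclass

import numpy as np


@dataclass
class H2Rep:
    """Human-to-robot representation: per-frame pose plus a static background."""
    pose: np.ndarray        # (T, J, 2) 2D keypoints per frame (placeholder)
    background: np.ndarray  # (H, W, 3) human-removed background plate


def extract_pose(video: np.ndarray) -> np.ndarray:
    """Stub pose extractor; a real system would run an off-the-shelf
    human pose estimator on each frame."""
    num_frames = video.shape[0]
    return np.zeros((num_frames, 17, 2), dtype=np.float32)


def extract_background(video: np.ndarray) -> np.ndarray:
    """Stub background extraction; a real system would inpaint the
    human out of the scene rather than average frames."""
    return video.mean(axis=0).astype(np.uint8)


def build_h2rep(video: np.ndarray) -> H2Rep:
    """Combine pose and background into the H2Rep conditioning signal."""
    return H2Rep(pose=extract_pose(video), background=extract_background(video))


def generate_robot_video(rep: H2Rep, num_frames: int) -> np.ndarray:
    """Stand-in for the diffusion-based in-context generator: conditioned on
    H2Rep, it would synthesize a robot video frame-aligned with the human
    motion. Here we just tile the background as a dummy output."""
    return np.stack([rep.background] * num_frames)


if __name__ == "__main__":
    # Dummy 16-frame third-person human interaction video.
    human_video = np.random.randint(0, 255, (16, 240, 320, 3), dtype=np.uint8)
    rep = build_h2rep(human_video)
    robot_video = generate_robot_video(rep, num_frames=human_video.shape[0])
    print(robot_video.shape)  # (16, 240, 320, 3): one robot frame per human frame
```

The stubs only mark where each stage sits; in the actual method the generator is a diffusion-based in-context model, and because it is conditioned on per-frame pose, the output stays frame-aligned with the input human actions.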


🎥 Qualitative Results

Visit our project page for full videos, comparisons, ablations, and failure case analysis:

👉 https://showlab.github.io/H2R-Grounder/


📦 Code & Models

Code and models will be released soon.


✏️ Citation

@article{ci2025h2rgrounder,
  title={H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos},
  author={Ci, Hai and Liu, Xiaokang and Yang, Pei and Song, Yiren and Shou, Mike Zheng},
  journal={arXiv preprint arXiv:XXXXX},
  year={2025}
}
