DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion
Qingcheng Zhao*,1,† · Xiang Zhang*,✉,2 · Haiyang Xu2 · Zeyuan Chen2 · Jianwen Xie3 · Yuan Gao4 · Zhuowen Tu2
1ShanghaiTech University · 2UC San Diego · 3Lambda, Inc. · 4Stanford University
ICCV 2025
* equal contribution ✉ corresponding author
† Project done while Qingcheng Zhao interned at UC San Diego.
Project Page | Paper | arXiv
We provide a pre-built Docker image at zx1239856/depr based on PyTorch 2.7.1 and CUDA 12.6. You can also build the image locally:
docker build -f Dockerfile . -t depr
Alternatively, you can install the dependencies using the commands listed in the Dockerfile.
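If you use the pre-built image, one possible way to start a container with GPU access and the repository mounted is shown below (the mount path /workspace/DepR is only an example):
docker run --gpus all -it --rm -v $(pwd):/workspace/DepR -w /workspace/DepR zx1239856/depr bash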
Please download the processed 3D-FRONT dataset from https://huggingface.co/datasets/zx1239856/DepR-3D-FRONT and extract the downloaded files into datasets/front3d_pifu/data (an example download command is given after the tree below). The resulting folder structure should look like:
data/
|-- metadata/ (Scene metadata)
|   |-- 0.jsonl
|   |-- ...
|-- pickled_data/ (Raw data processed by InstPIFu)
|   |-- test/
|   |   |-- rendertask3000.pkl
|   |   |-- ...
|-- sdf_layout/ (GT layouts)
|   |-- 10000.npy
|   |-- ...
|-- 3D-FUTURE-watertight/ (GT meshes, required for evaluation)
|   |-- 0004ae9a-1d27-4dbd-8416-879e9de1de8d/
|   |   |-- raw_watertight.obj
|   |   |-- ...
|-- instpifu_mask/ (Instance masks provided by InstPIFu)
|-- panoptic/ (Panoptic segmentation maps we rendered)
|-- img/ (Optional, can be extracted from pickled data)
|-- depth/depth_pro/ (Optional)
`-- grounded_sam/ (Optional)
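As a rough sketch, the dataset can be fetched with the huggingface_hub CLI (assuming it is installed; any archives in the repository still need to be extracted so the tree matches the layout above):
huggingface-cli download zx1239856/DepR-3D-FRONT --repo-type dataset --local-dir datasets/front3d_pifu/data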
Alternatively, you may generate the depth and segmentation inputs yourself following the instructions below.
Generate Segmentation
Please place the Grounded SAM weights in checkpoint/grounded_sam; an example download sketch follows the listing below.
grounded_sam/
|-- GroundingDINO_SwinB.py
|-- groundingdino_swinb_cogcoor.pth
|-- groundingdino_swint_ogc.pth
`-- sam_vit_h_4b8939.pth
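A minimal sketch for fetching these weights (the SAM ViT-H URL is the standard public checkpoint; the GroundingDINO checkpoints and the GroundingDINO_SwinB.py config can be obtained from the IDEA-Research/GroundingDINO repository and its releases):
mkdir -p checkpoint/grounded_sam
# Segment Anything ViT-H checkpoint
wget -P checkpoint/grounded_sam https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth
# Place groundingdino_swinb_cogcoor.pth, groundingdino_swint_ogc.pth, and GroundingDINO_SwinB.py
# (from the GroundingDINO repository) alongside it.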
python -m scripts.run_grounded_sam
Generate Depth
Please put Depth Pro weights in checkpoint/.
python -m scripts.run_depth_pro --output depth_pro
Please download our weights from https://huggingface.co/zx1239856/DepR and put everything in the checkpoint folder.
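For example, with the huggingface_hub CLI (assuming it is installed; the file names inside the repository may differ):
huggingface-cli download zx1239856/DepR --local-dir checkpoint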
We provide a demo.ipynb notebook that demonstrates inference on real-world images.
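Assuming Jupyter is available in your environment (it is not necessarily bundled with the Docker image), the notebook can be opened with:
jupyter lab demo.ipynb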
Object-level Evaluation
You may change 8 to the actual number of GPUs as needed.
bash launch.sh 8 all
(Optional) Guided Sampling
bash launch.sh 8 all --guided
Scene-level Evaluation
# Generate shapes
bash launch.sh 8 sample --metadata datasets/front3d_pifu/meta/test_scene.jsonl --use-sam
# Layout optim
bash launch.sh 8 scene --use-sam
# Prepare GT scene
python -m scripts.build_gt --out-dir output/gt
# Calculate scene-level CD/F1
accelerate launch --num_processes=8 --multi_gpu -m scripts.eval_scene --gt-pcd-dir output/gt/pcds --pred-dir output/infer/sam_3dproj_attn_dino_c9_augdep_augmask_nocfg_model_0074999/ --save-dir output/evaluation/results --method depr
This repository is released under the CC-BY-SA 4.0 license.
Our framework utilizes pre-trained models including Grounded-Segment-Anything, Depth Pro, and DINOv2.
Our code is built upon diffusers, Uni-3D, and BlockFusion.
We use physically based renderings of 3D-FRONT scenes provided by InstPIFu. Additionally, we rendered panoptic segmentation maps ourselves.
We thank all the authors for open-sourcing their code and datasets and for their great contributions to the community.
If you find our work useful, please consider citing:
@InProceedings{Zhao_2025_ICCV_DepR,
author = {Zhao, Qingcheng and Zhang, Xiang and Xu, Haiyang and Chen, Zeyuan and Xie, Jianwen and Gao, Yuan and Tu, Zhuowen},
title = {DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion},
booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
month = {October},
year = {2025},
pages = {5722-5733}
}