
insdet/instance-detection


A High-Resolution Dataset for Instance Detection with Multi-View Instance Capture

Qianqian Shen1 · Yunhan Zhao2 · Nahyun Kwon3 · Jeeeun Kim3 · Yanan Li1 · Shu Kong3,4,5

1Zhejiang Lab 2UC Irvine 3Texas A&M University 4University of Macau 5Institute of Collaborative

Links: Paper PDF · Project Page · Benchmark

The paper has been accepted by NeurIPS (Datasets and Benchmarks) 2023.

InsDet

Dataset

The InsDet dataset is a high-resolution real-world dataset for Instance Detection with Multi-View Instance Capture.
We provide a mini version, InsDet-mini, for demos and visualization, and the full dataset, InsDet-FULL.

The full dataset contains 100 objects with multi-view profile images captured at 24 rotation positions (every 15°), 160 high-resolution testing scene images, and 200 pure background images. The mini version contains 5 objects, 10 testing scene images, and 10 pure background images.

Details

The Objects folder contains:

  • 000_aveda_shampoo
    • images: raw RGB images (e.g., "images/001.jpg")
    • masks: segmentation masks generated by the GrabCut Annotation Toolbox (e.g., "masks/001.png")
  • ⋮

  • 099_mug_blue

[figure: vis-objects]

Tip: The first three digits specify the instance id.
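The naming convention above can be sketched with a small helper (a hypothetical function name, not part of the dataset tooling):

```python
def parse_instance_id(folder_name: str) -> tuple[int, str]:
    """Split an object folder name like '000_aveda_shampoo' into
    (instance_id, instance_name): the first three digits are the id."""
    prefix, _, name = folder_name.partition("_")
    return int(prefix), name

print(parse_instance_id("000_aveda_shampoo"))  # (0, 'aveda_shampoo')
```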

The Scenes folder contains:

  • easy
    • leisure_zone
      • raw RGB images at 6144×8192 pixels (e.g., "office_001/rgb_000.jpg")
      • bounding box annotations for objects in test scenes, created with the labelImg toolbox in PascalVOC format (e.g., "office_001/rgb_000.xml")
    • meeting_room
    • office_002
    • pantry_room_002
    • sink
  • hard
    • office_001
    • pantry_room_001

[figure: vis-scenes]

Tip: Each bounding box is specified by [xmin, ymin, xmax, ymax].

The Background folder contains 200 pure background images that do not include any instances from the Objects folder.

[figure: vis-background]

Code

The project is built on detectron2, segment-anything, and DINOv2.

Demo

The Jupyter notebooks demonstrate our non-learned method using SAM and DINOv2. For efficiency, we use lightweight pretrained models: SAM (vit_l) and DINOv2 (dinov2_vits14).
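The matching step of such a non-learned pipeline (SAM proposals embedded with DINOv2, then compared against per-instance features) can be sketched with plain NumPy; the function name and array shapes here are illustrative assumptions, not the notebooks' exact code:

```python
import numpy as np

def match_proposals(proposal_feats, instance_feats):
    """Assign each region proposal to its most similar object instance
    by cosine similarity between feature vectors.

    proposal_feats: (P, D) array, one feature per SAM proposal.
    instance_feats: (N, D) array, one feature per object instance.
    Returns (best_instance_index, best_score) for each proposal.
    """
    # L2-normalize so the dot product equals cosine similarity.
    p = proposal_feats / np.linalg.norm(proposal_feats, axis=1, keepdims=True)
    q = instance_feats / np.linalg.norm(instance_feats, axis=1, keepdims=True)
    sim = p @ q.T                                   # (P, N) similarity matrix
    best = sim.argmax(axis=1)                       # best instance per proposal
    return best, sim[np.arange(len(best)), best]
```

In practice, proposals whose best score falls below a threshold would be discarded as background.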

Citation

If you find our project useful, please consider citing:

@inproceedings{shen2023high,
        title={A high-resolution dataset for instance detection with multi-view object capture},
        author={Shen, Qianqian and Zhao, Yunhan and Kwon, Nahyun and Kim, Jeeeun and Li, Yanan and Kong, Shu},
        booktitle={NeurIPS Datasets and Benchmarks Track},
        year={2023}
}