Shuting He · Peilin Ji · Yitong Yang · Changshuo Wang · Jiayi Ji · Yinglin Wang · Henghui Ding
💡 Welcome to the official repository of our survey paper.
📌 Please feel free to open an issue or pull request for any relevant work we may have missed.
We present the first dedicated survey that systematically reviews downstream applications of 3D Gaussian Splatting (3DGS) beyond classical view synthesis. Specifically, we focus on three rapidly evolving directions: segmentation, editing, and generation.
- ✂️ Existing Datasets for 3DGS Segmentation
- 🗂️ Existing Datasets for 3DGS Editing
- 🧩 Existing Datasets for 3DGS Generation
- 🛠️ Existing Methods for 3DGS Segmentation
- ✏️ Existing Methods for 3DGS Editing
- 🎨 Existing Methods for 3DGS Generation
- 🎖 Other Application Tasks
- ⛳ Related Survey
- 📢 Citation
This section summarizes commonly used datasets for segmentation tasks in 3D Gaussian Splatting.
| Datasets with URL | Venue | #Scenes | #Views | Highlight |
|---|---|---|---|---|
| ScanNet | CVPR'17 | 1513 | 1500 | Large-scale RGB-D scans with 3D poses and semantics for advanced scene understanding. |
| Replica | ArXiv'19 | 18 | 175 | High-quality indoor scans with geometry, HDR textures, and rich semantic labels. |
| NVOS | CVPR'21 | 8 | 36 | Built on LLFF with undistorted images, annotated with masks and scribbles for segmentation tasks. |
| Mip-NeRF 360 | CVPR'22 | 9 | 215 | Focusing on capturing complex lighting, geometry, and texture details. |
| SPIn-NeRF | CVPR'23 | 10 | 100 | Providing challenging real-world scenes with views both with and without a target object. |
| 3D-OVS | NeurIPS'23 | 10 | 30 | Including high-quality 3D objects spanning diverse categories with language-aligned semantic labels. |
| LERF-OVS | CVPR'24 | 4 | 200 | An extended version of the LERF dataset with ground-truth mask annotations for open-vocabulary segmentation. |
| LERF-Mask | ECCV'24 | 3 | 200 | Containing semantic annotations of three scenes from the LERF dataset with a total of 23 prompts. |
| Ref-LERF | ICML'25 | 4 | 200 | Focusing on spatial relationships, annotated with natural language expressions for referring 3DGS segmentation. |
| SceneSplat-7K | ICCV'25 | 7k | - | The first large-scale, high-quality 3DGS dataset for indoor environments boosting scene understanding research. |
| SceneSplat-49K | ArXiv'25 | 49k | - | Containing diverse indoor and outdoor scenes, featuring complex, high-quality full scenes from multiple sources. |
This section introduces datasets suitable for 3D editing tasks.
| Datasets with URL | Venue | #Scenes | #Views | Highlight |
|---|---|---|---|---|
| DTU | CVPR'14 | 80 | 343 | Each scene consists of 49 or 64 accurate camera positions and reference structured light scans. |
| Tanks and Temples | TOG'17 | 14 | - | Includes individual objects (e.g., Tank, Train) and large indoor scenes (e.g., Auditorium, Museum). |
| GL3D | ACCV'18 | 543 | 230 | Contains 125,623 high-res images captured by drones from various environments with geometric overlap. |
| LLFF | TOG'19 | 32 | 25 | Uses COLMAP SfM to compute poses for real images. |
| BlendedMVS | CVPR'20 | 113 | 158 | A large-scale MVS dataset, which contains a total of 17,818 images. |
| NeRF-synthetic | ECCV'20 | 8 | 100 | Objects on white backgrounds with 800×800 images and camera poses. |
| Co3D | CVPR'21 | - | - | Consists of 1.5 million frames extracted from ~19,000 videos, covering 50 MS-COCO categories with camera poses and 3D point clouds. |
| Mip-NeRF 360 | CVPR'22 | 9 | 215 | 360° panoramic images from indoor and outdoor environments. |
| SPIn-NeRF | CVPR'23 | 10 | 100 | Providing challenging real-world scenes with views both with and without a target object. |
| IN2N | ICCV'23 | 6 | 172 | Enabling structured and globally consistent 3D scene modifications while preserving the original scene's identity. |
| ScanNet++ | ICCV'23 | 460 | 608 | Contains 280,000 DSLR images and over 3.7M iPhone RGB-D frames. |
| NeRFstudio | SIGGRAPH'23 | 10 | - | Includes 4 phone captures with pinhole lenses and 6 mirrorless camera captures with fisheye lenses. |
| 360-USID | ArXiv'25 | 7 | 300 | Includes 4 outdoor (Box, Cone, Lawn, Plant) and 3 indoor (Cookie, Sunflower, Dustpan) scenes. |
This section covers datasets used for 3DGS-based generation tasks.
| Datasets with URL | Venue | Type | #Scenes | Highlight |
|---|---|---|---|---|
| NYU Depth V2 | ECCV'12 | Image-to-3D | 464 | Contains 1449 RGB-D images capturing 464 diverse indoor scenes, with detailed annotations. |
| ShapeNet | ArXiv'15 | Image & Text-to-3D | 60K | 3D models spanning 55 categories, each with a geometry file and unique identifier. |
| ScanNet | CVPR'17 | Image-to-3D | 1513 | Contains 2.5M views in 1513 indoor scenes annotated with 3D camera poses. |
| RealEstate10K | SIGGRAPH'18 | Image-to-3D | 80K | Comprises home walkthrough videos from YouTube. |
| Replica | ArXiv'19 | Image-to-3D | 18 | A 3D indoor scene dataset featuring dense meshes, HDR textures, and semantic labels. |
| ACID | ICCV'21 | Image-to-3D | 13,047 | Features aerial landscape videos, with 11,075 training scenes and 1,972 testing scenes. |
| GSO | ICRA'22 | Image & Text-to-3D | 1030 | Comprises 3D scanned common household items. |
| LAION-5B | NeurIPS'22 | Text-to-3D | - | A vast-scale collection of 5.85 billion image-text pairs. |
| Objaverse | CVPR'23 | Image & Text-to-3D | 800K | Over 800K 3D models with rich annotations. |
| OmniObject3D | CVPR'23 | Image-to-3D | 6K | A large-scale collection of high-quality real-scanned 3D objects with rich 2D and 3D annotations. |
| LOM | AAAI'24 | Image-to-3D | 5 | Includes 5 real-world scenes, each with 25–48 sRGB images captured in difficult lighting. |
| G-objaverse | ECCV'24 | Image & Text-to-3D | 280K | Contains 10 general classes totaling about 280K samples. |
| DL3DV-10K | CVPR'24 | Image-to-3D | 10K | Large-scale scene dataset that contains both indoor and outdoor scenarios. |
This section summarizes key segmentation approaches based on 3DGS.
This section gives an overview of methods that enable direct or indirect editing of 3DGS content.
This section discusses generation methods that produce 3DGS representations from multimodal inputs such as text and images.
| Year | Venue | Paper Abbr | Title | Project/Code |
|---|---|---|---|---|
| 2024 | ArXiv | 3DGS-DET | 3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection | Code |
| 2024 | ICLR | Gaussian-Det | Gaussian-Det: Learning Closed-Surface Gaussians for 3D Object Detection | ❌ |
| 2025 | NeurIPS | - | 3D Gaussian Splatting based Scene-independent Relocalization with Unidirectional and Bidirectional Feature Fusion | ❌ |
| 2025 | ACM MM | SpatialReasoner | A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding | Code |
| 2025 | IROS | MATT-GS | MATT-GS: Masked Attention-based 3DGS for Robot Perception and Object Detection | ❌ |
- A Survey on 3D Gaussian Splatting
- 3D Gaussian Splatting as a New Era: A Survey
- Recent Advances in 3D Gaussian Splatting
- Gaussian Splatting: 3D Reconstruction and Novel View Synthesis, a Review
- 3D Gaussian Splatting: Survey, Technologies, Challenges, and Opportunities
- 3DGS.zip: A survey on 3D Gaussian Splatting Compression Methods
- 3D Gaussian Splatting in Robotics: A Survey
- Compression in 3D Gaussian Splatting: A Survey of Methods, Trends, and Future Directions
If you find this survey helpful, please consider citing it in your work. Thank you for your support!
@article{he2025survey,
title={A Survey on 3D Gaussian Splatting in Segmentation, Editing and Generation},
author={He, Shuting and Ji, Peilin and Yang, Yitong and Wang, Changshuo and Ji, Jiayi and Wang, Yinglin and Ding, Henghui},
journal={arXiv preprint arXiv:2508.09977},
year={2025}
}
