naver/panst3r



📖 Paper · 🎬 Overview · 🔖 Cite


Official implementation of PanSt3R: Multi-view Consistent Panoptic Segmentation. Presented at ICCV 2025.


[Figure: PanSt3R panoptic 3D reconstruction examples]

Table of Contents

  • License
  • Getting started
  • Running the demo
  • Checkpoints
  • Training
  • Cite

License

PanSt3R is released under the PanSt3R Non-Commercial License. See LICENSE and NOTICE for details; NOTICE also lists the datasets used to train the released checkpoints.

Getting started

Installation

Setup tested on Python 3.11.

  1. Clone repository

    git clone https://github.com/naver/panst3r.git
    cd panst3r
  2. Install PyTorch (follow official instructions for your platform)

    pip install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 --index-url https://download.pytorch.org/whl/cu126
  3. (optional) Install xFormers (for memory-efficient attention):

    pip install -U xformers==0.0.30 --index-url https://download.pytorch.org/whl/cu126 # Use appropriate wheel for your setup
  4. Install PanSt3R with dependencies

    pip install -e . 
  5. (optional) Install the cuRoPE extension. Make sure you have appropriate CUDA versions installed and available in your $PATH and $LD_LIBRARY_PATH, then build the extension with:

    pip install --no-build-isolation "git+https://github.com/naver/croco.git@croco_module#egg=curope&subdirectory=curope"
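
To sanity-check the installation, you can try importing the main components. This is a minimal sketch; the curope module name below is assumed from the egg name in the pip spec above, and the optional lines only apply if you ran the corresponding steps.

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import xformers; print(xformers.__version__)"  # only if step 3 was run
python -c "import curope"                                 # only if step 5 was run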

Running the demo

We include a Gradio demo for running PanSt3R inference. You can run it with the following command.

python gradio_panst3r.py --weights /path/to/model/weights
Optional arguments
  • --weights: Path to model weights for inference
  • --retrieval: Path to retrieval weights for open-vocabulary segmentation
  • --server_name: Specify the server URL (default: 127.0.0.1)
  • --server_port: Choose the port for the Gradio app (default: auto-select starting at 7860)
  • --viser_port: Choose the port for the embedded viser visualizer (default: auto-select starting at 5000)
  • --image_size: Sets input image size (select according to the model you use)
  • --encoder: Override config for the MUSt3R encoder
  • --decoder: Override config for the MUSt3R decoder
  • --camera_animation: Enable camera animation controls in the visualizer
  • --allow_local_files: Allow loading local files in the app
  • --device: PyTorch device to use (cuda, cpu, etc.; default: cuda)
  • --tmp_dir: Directory for temporary files
  • --silent / --quiet / -q: Suppress verbose output
  • --amp: Use Automatic Mixed Precision (bf16, fp16, or False; default: False)
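
For example, to expose the demo on all network interfaces with bf16 mixed precision and open-vocabulary retrieval enabled (paths are placeholders; all flags are listed above):

python gradio_panst3r.py \
    --weights /path/to/model/weights \
    --retrieval /path/to/retrieval/weights \
    --server_name 0.0.0.0 --server_port 7860 \
    --amp bf16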

Using the demo

  1. Upload images
  2. Select parameters (number of keyframes, prediction classes, and the postprocessing approach)
  3. Click "Run"
  4. After prediction and postprocessing are complete, the 3D reconstruction will appear in the "Visualizer" panel.

Demo Instructions

Note

Keyframes are selected at the start and processed together. Predictions for the remaining frames are performed individually, based on the memory features and decoded queries from the keyframes.

For typical use, set the number of keyframes equal to the number of input images. For large image collections (more than 50 images), reducing the number of keyframes helps manage memory usage.

Warning

The demo does not currently support multiple concurrent sessions properly.

Caution

Using --allow_local_files lets users load images from local paths on the machine hosting the server. Use with caution.

Checkpoints

The following checkpoints are available for download. We include PQ scores obtained via direct multi-view prediction on the rendered test images (without LUDVIG).

Model            PQ (hypersim)  PQ (replica)  PQ (scannet)  MD5 checksum
PanSt3R_512_5ds  56.5           62.0          65.7          c3836c108f1bf441fe53776e825cd1ac
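
After downloading, you can verify the file against the checksum above. The filename here is an assumption; use whatever name your download was saved under.

md5sum PanSt3R_512_5ds.pth  # expect c3836c108f1bf441fe53776e825cd1ac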

Training

We also provide an example training setup (preprocessing, dataloaders, training code) on ScanNet++.

Preparing the data

We include a preprocessing script based on the one for MUSt3R. It prepares the corresponding 3D targets (camera parameters, depthmaps) and panoptic masks; depthmaps and panoptic masks are rendered using pyrender.

  1. Download the ScanNet++V2 data
  2. Prepare image pairs (e.g. random sampling with ensured overlap).
    Precomputed pairs for ScanNet++V2 are available for download here.
  3. Run the preprocessing script:
python tools/preprocess_scannetpp.py \
--root_dir /path/to/scannetppv2 \
--pairs_dir /path/to/pairs_dir \
--output_dir /path/to/output_dir

Warning

We use a fork of pyrender with an added shader for rendering instance masks (without anti-aliasing). If you followed the installation steps in a fresh environment, it should be installed automatically. Manual installation is possible via pip install "git+https://github.com/lojzezust/pyrender.git".

Tip

Use --pyopengl-platform "egl" if running in headless mode (e.g. on a remote server).
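
Putting the pieces together, a headless run might look like this (paths are placeholders; the flags are the ones documented above):

python tools/preprocess_scannetpp.py \
    --root_dir /path/to/scannetppv2 \
    --pairs_dir /path/to/pairs_dir \
    --output_dir /path/to/output_dir \
    --pyopengl-platform "egl"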

Training

After the data is prepared, you can run training via the provided training script and configurations.

Note

Set paths to your preprocessed data and checkpoints in config/base.yaml.

  • If training from scratch, provide a MUSt3R checkpoint (can be found here).
  • If fine-tuning from PanSt3R, provide a path to the PanSt3R checkpoint.
# Limit CPU threading to prevent oversubscription in multi-GPU training
export MKL_NUM_THREADS=1
export NUMEXPR_NUM_THREADS=1
export OMP_NUM_THREADS=1

Single GPU training (or debug):

python train.py --config-name=base

For distributed training use torchrun:

# 2 GPUs on one node
torchrun --nnodes=1 --nproc_per_node=2 \
    train.py --config-name=distributed
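
For multi-node training, standard torchrun rendezvous flags should work. The following is a sketch assuming two nodes with two GPUs each; NODE_RANK and MASTER_ADDR are placeholders you set per node, and you should check that the distributed config matches your setup.

# Run on each node: NODE_RANK is 0 on the first node and 1 on the second;
# MASTER_ADDR is the address of the first node.
torchrun --nnodes=2 --nproc_per_node=2 \
    --node_rank=$NODE_RANK --rdzv_backend=c10d \
    --rdzv_endpoint=$MASTER_ADDR:29500 \
    train.py --config-name=distributed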

Cite

@InProceedings{zust2025panst3r,
  title={PanSt3R: Multi-view Consistent Panoptic Segmentation},
  author={Zust, Lojze and Cabon, Yohann and Marrie, Juliette and Antsfeld, Leonid and Chidlovskii, Boris and Revaud, Jerome and Csurka, Gabriela},
  booktitle={International Conference on Computer Vision (ICCV)},
  year={2025}
}
