Repository providing the source code for the paper
CMRNext: Camera to LiDAR Matching in the Wild for Localization and Extrinsic Calibration
Daniele Cattaneo and Abhinav Valada
IEEE Transactions on Robotics, 2025.
If you use CMRNext, please cite:
@article{cattaneo2025cmrnext,
  title={CMRNext: Camera to LiDAR Matching in the Wild for Localization and Extrinsic Calibration},
  author={Daniele Cattaneo and Abhinav Valada},
  journal={IEEE Transactions on Robotics},
  year={2025},
  volume={41},
  pages={1995-2013},
  doi={10.1109/TRO.2025.3546784}
}
- 2025/08/19 - Extrinsic calibration training
- 2025/05/07 - Localization inference code on Pandaset
- 2025/05/02 - We released the inference code for calibration on custom datasets, and we added some visualization.
- 2025/04/11 - We released the inference code for monocular localization on Argoverse
- 2025/04/07 - We released the inference code for monocular localization on KITTI
- 2025/03/26 - We released the inference code for extrinsic calibration.
- 2025/03/25 - We released the ROS code for camera-LiDAR extrinsic calibration on KITTI using MDPCalib, which combines CMRNext with graph optimization.
- Training code for localization
LiDARs are widely used for mapping and localization in dynamic environments. However, their high cost limits their widespread adoption. On the other hand, monocular localization in LiDAR maps using inexpensive cameras is a cost-effective alternative for large-scale deployment. Nevertheless, most existing approaches struggle to generalize to new sensor setups and environments, requiring retraining or fine-tuning. In this paper, we present CMRNext, a novel approach for camera-LiDAR matching that is independent of sensor-specific parameters, generalizable, and can be used in the wild for monocular localization in LiDAR maps and camera-LiDAR extrinsic calibration. CMRNext exploits recent advances in deep neural networks for matching cross-modal data and standard geometric techniques for robust pose estimation. We reformulate the point-pixel matching problem as an optical flow estimation problem and solve the Perspective-n-Point problem based on the resulting correspondences to find the relative pose between the camera and the LiDAR point cloud. We extensively evaluate CMRNext on six different robotic platforms, including three publicly available datasets and three in-house robots. Our experimental evaluations demonstrate that CMRNext outperforms existing approaches on both tasks and effectively generalizes to previously unseen environments and sensor setups in a zero-shot manner.
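To make the final geometric step concrete, the sketch below shows how a relative pose can be recovered from point-pixel correspondences with a RANSAC-based Perspective-n-Point solver, as described above. This is an illustrative sketch built on OpenCV, not the exact implementation in this repository; the function name, variable names, and RANSAC parameters are placeholders, and the way the correspondences are obtained from the predicted flow is omitted.

```python
# Illustrative sketch of the pose-estimation step (not the repository's code):
# given 2D-3D correspondences produced by the predicted optical flow, recover
# the camera pose with PnP + RANSAC.
import cv2
import numpy as np

def pose_from_correspondences(pts_3d: np.ndarray, pts_2d: np.ndarray,
                              K: np.ndarray) -> np.ndarray:
    """pts_3d: (N, 3) LiDAR points, pts_2d: (N, 2) matched pixels, K: (3, 3) intrinsics."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts_3d.astype(np.float64), pts_2d.astype(np.float64), K, None,
        iterationsCount=1000, reprojectionError=2.0)
    if not ok:
        raise RuntimeError("PnP failed: not enough consistent correspondences")
    T = np.eye(4)
    T[:3, :3], _ = cv2.Rodrigues(rvec)   # axis-angle -> rotation matrix
    T[:3, 3] = tvec.ravel()              # translation
    return T                             # transform from LiDAR/map frame to camera frame
```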
The code natively supports the following datasets: KITTI, Argoverse 1, and Pandaset.
Download the KITTI Odometry dataset available here.
All parts except for the grayscale images are required. Additionally, download the ground truth poses from SemanticKITTI, specifically the SemanticKITTI label data.
After extracting all files (e.g., in the folder /data/KITTI), the folder structure should look like:
/data/KITTI/sequences
├── 00
│ ├── image_2
│ │ ├── 000000.png
│ │ ├── 000001.png
│ │ ├── ...
│ │ └── 004540.png
│ ├── image_3
│ │ ├── 000000.png
│ │ ├── 000001.png
│ │ ├── ...
│ │ └── 004540.png
│ ├── velodyne
│ │ ├── 000000.bin
│ │ ├── 000001.bin
│ │ ├── ...
│ │ └── 004540.bin
│ ├── poses.txt
│ └── calib.txt
└── 01
├── ...
Download the four train sets of Argoverse 3D Tracking v1.1 from the official website.
Extract all files (e.g., in the folder /data/argoverse).
Unfortunately, the Pandaset dataset is no longer available for download from the official website. An alternative download is available on Kaggle; however, I haven't tested it.
Please download the model weights for camera-LiDAR extrinsic calibration and for monocular localization in LiDAR maps and store them under: /data/.
- Calibration model weights: https://cmrnext.cs.uni-freiburg.de/downloads/cmrnext_calibration_weights.zip
- Localization model weights: https://cmrnext.cs.uni-freiburg.de/downloads/cmrnext_localization_weights.zip
The best way to run CMRNext is by using Docker. Given the endless variability in system setups, many things can go wrong when configuring the environment manually. If you choose not to use Docker, it may be difficult for me to help troubleshoot any issues you encounter.
Install Docker and NVIDIA Container Toolkit.
Tested with Docker version 28.0.1 and NVIDIA Container Toolkit version 1.17.5-1
Caution
The provided Docker container does NOT support NVIDIA RTX 50 Series (Blackwell). Adding support for it has low priority currently. Feel free to open a pull request.
- To build the image, run the following in the root of this repository:
docker build . -t cmrnext
- Prepare for using GUIs in the container:
xhost +local:docker
- Start the container and mount the folder where your datasets are located:
docker run --runtime=nvidia --shm-size=2g -it -v /tmp/.X11-unix:/tmp/.X11-unix -v PATH_TO_DATA:/data -e DISPLAY -e XAUTHORITY -e NVIDIA_DRIVER_CAPABILITIES=all cmrnext
- Within the container, move to the code folder:
cd /root/CMRNext/
Run the inference for camera-LiDAR extrinsic calibration, assuming the weights and the datasets are located under /data/; change the paths according to your setup. Click on the specific dataset to see the exact command to run.
KITTI left camera:
python3 evaluate_flow_calibration.py --weights /data/cmrnext-calib-LEnc-iter1.tar /data/cmrnext-calib-LEnc-iter5.tar /data/cmrnext-calib-LEnc-iter6.tar --data_folder /data/KITTI/sequences/ --dataset kitti
Example results:
KITTI right camera:
python3 evaluate_flow_calibration.py --weights /data/cmrnext-calib-LEnc-iter1.tar /data/cmrnext-calib-LEnc-iter5.tar /data/cmrnext-calib-LEnc-iter6.tar --data_folder /data/KITTI/sequences/ --dataset kitti --cam 3
Example results:
Argoverse V1:
python3 evaluate_flow_calibration.py --weights /data/cmrnext-calib-LEnc-iter1.tar /data/cmrnext-calib-LEnc-iter5.tar /data/cmrnext-calib-LEnc-iter6.tar --data_folder /data/argoverse/argoverse-tracking/ --dataset argoverse
Example results:
Pandaset:
python3 evaluate_flow_calibration.py --weights /data/cmrnext-calib-LEnc-iter1.tar /data/cmrnext-calib-LEnc-iter5.tar /data/cmrnext-calib-LEnc-iter6.tar --data_folder /data/pandaset/ --dataset pandaset
Example results:
To run extrinsic calibration on your own dataset, record multiple (the more, the better) synchronized image-point cloud pairs. The easiest way to do this is to record one image-point cloud pair with the robot stationary and to repeat this at different locations.
Caution
If you record data while the robot is moving, you have to make sure that the point clouds are undistorted (motion compensated) and time-synchronized to match the timestamp of the camera. Otherwise, there is no way to recover an accurate extrinsic calibration.
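For reference, a minimal sketch of constant-velocity deskewing is shown below. It is not part of CMRNext; it assumes per-point timestamps and a known ego-motion over the sweep (delta_pose), which depend on your sensor setup, and it uses SciPy for the rotation interpolation. In practice, the LiDAR driver or an odometry pipeline such as KISS-ICP (see acknowledgments) usually provides motion-compensated point clouds.

```python
# Minimal constant-velocity deskewing sketch (illustration only, NOT part of CMRNext).
# Assumptions: each point has a timestamp normalized to [0, 1] within the sweep,
# and the sensor motion over the sweep (delta_pose, 4x4) is known, e.g. from odometry.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

def deskew(points: np.ndarray, timestamps: np.ndarray, delta_pose: np.ndarray) -> np.ndarray:
    """Warp every point into the end-of-sweep sensor frame.

    points:      (N, 3) LiDAR points, each expressed in the sensor frame at capture time.
    timestamps:  (N,) per-point times normalized to [0, 1] over the sweep.
    delta_pose:  (4, 4) sensor motion from sweep start to sweep end.
    """
    # Interpolate the sensor pose linearly (translation) and spherically (rotation).
    slerp = Slerp([0.0, 1.0],
                  Rotation.from_matrix(np.stack([np.eye(3), delta_pose[:3, :3]])))
    per_point_rot = slerp(timestamps)                         # rotation at each capture time
    per_point_trans = timestamps[:, None] * delta_pose[:3, 3]

    # Express points in the sweep-start frame, then re-express them in the end-of-sweep frame.
    in_start_frame = per_point_rot.apply(points) + per_point_trans
    end_rot_inv = Rotation.from_matrix(delta_pose[:3, :3]).inv()
    return end_rot_inv.apply(in_start_frame - delta_pose[:3, 3])
```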
Then, place your UNDISTORTED images in a folder named camera, and your point clouds in a folder named lidar.
The filenames of matching camera and lidar pairs must be the same (except for the extension).
Additionally, create a file named calibration.yaml in which you provide the intrinsic parameters of your camera and an initial extrinsic transformation.
The main purpose of this transformation is to rotate the LiDAR axes so that they face the same direction as the camera.
Without an appropriate initial calibration, the method will not work.
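The exact key names expected by CMRNext are defined by the calibration.yaml shipped in sample_custom_dataset, which you should use as the authoritative template. As a hedged illustration of the kind of information you need to provide (pinhole intrinsics plus a rough initial LiDAR-to-camera transform), the snippet below writes such a file with placeholder key names (intrinsics, initial_extrinsic) and assumes NumPy and PyYAML are available:

```python
# Illustrative helper for assembling calibration.yaml. The key names here are
# placeholders; use sample_custom_dataset/calibration.yaml as the real template.
import numpy as np
import yaml

# Pinhole intrinsics of your camera and the image size (example values).
intrinsics = {"fx": 900.0, "fy": 900.0, "cx": 640.0, "cy": 360.0,
              "width": 1280, "height": 720}

# Rough initial LiDAR-to-camera transform. Its main job is to rotate the LiDAR
# axes so they face the same direction as the camera; a hand-measured guess is enough.
initial_extrinsic = np.eye(4)
initial_extrinsic[:3, :3] = np.array([[0, -1, 0],   # x_cam = -y_lidar
                                      [0, 0, -1],   # y_cam = -z_lidar
                                      [1, 0, 0]])   # z_cam =  x_lidar
initial_extrinsic[:3, 3] = [0.0, -0.1, -0.2]        # rough lever arm in meters

with open("calibration.yaml", "w") as f:
    yaml.safe_dump({"intrinsics": intrinsics,
                    "initial_extrinsic": initial_extrinsic.tolist()}, f)
```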
We provide a sample dataset recorded with our in-house robot in sample_custom_dataset. The final folder structure should look like:
/data/custom_dataset/
├── camera
│ ├── 000000.png
│ ├── 000001.png
│ ├── ...
├── lidar
│ ├── 000000.pcd
│ ├── 000001.pcd
│ ├── ...
└── calibration.yaml
Different point cloud formats are supported. After your data is ready, run CMRNext:
python3 evaluate_flow_calibration.py --weights /data/cmrnext-calib-LEnc-iter1.tar /data/cmrnext-calib-LEnc-iter5.tar /data/cmrnext-calib-LEnc-iter6.tar --data_folder ./sample_custom_dataset --dataset custom --num_worker 2 --quantile 1.0 --max_r 0 --max_t 0 --downsample --viz
The --viz flag is useful for visualization and debugging, but it slows down the inference; you can remove it after checking that the initial projection and the predicted projection are as expected.
The --downsample argument should give better results for most use cases, but if your camera images look similar to the ones in the KITTI dataset, you might try removing the argument and checking whether the results improve.
To train the Camera-LiDAR extrinsic calibration models, run the following commands:
python3 train_calibration.py --savemodel /data/iter1/ --data_folder_argo /data/argoverse/argoverse-tracking/ --data_folder_kitti /data/KITTI/sequences/ --data_folder_panda /data/pandaset/ --max_r 20 --max_t 1.5
python3 train_calibration.py --savemodel /data/iter2/ --data_folder_argo /data/argoverse/argoverse-tracking/ --data_folder_kitti /data/KITTI/sequences/ --data_folder_panda /data/pandaset/ --max_r 1 --max_t 0.1
python3 train_calibration.py --savemodel /data/iter3/ --data_folder_argo /data/argoverse/argoverse-tracking/ --data_folder_kitti /data/KITTI/sequences/ --data_folder_panda /data/pandaset/ --max_r 0.2 --max_t 0.05
This section contains the steps to perform inference for monocular localization in LiDAR maps. First, we need to create the LiDAR maps for each dataset.
To generate the LiDAR maps of the KITTI dataset, run the following commands one after the other:
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 00
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 03
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 04
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 05
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 06
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 07
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 08
python3 -m preprocess.kitti_maps_semantic --base_folder /data/KITTI/ --sequence 09
To generate the LiDAR maps of the Argoverse dataset, run the following command:
python3 -m preprocess.argoverse --base_folder /data/argoverse/argoverse-tracking
To generate the LiDAR maps of the Pandaset dataset, run the following command:
python3 -m preprocess.pandaset --base_folder /data/pandaset
Run the inference for monocular localization in LiDAR maps within the Docker container, assuming the weights and the datasets are located under /data/. Change the paths according to your setup. Click on the specific dataset to see the exact command to run.
KITTI:
python3 evaluate_flow_localization.py --weights /data/cmrnext-iter1.tar /data/cmrnext-iter2.tar /data/cmrnext-iter3.tar --data_folder /data/KITTI/sequences/ --dataset kitti
Argoverse:
python3 evaluate_flow_localization.py --weights /data/cmrnext-iter1.tar /data/cmrnext-iter2.tar /data/cmrnext-iter3.tar --data_folder /data/argoverse/argoverse-tracking/ --dataset argoverse
Pandaset:
python3 evaluate_flow_localization.py --weights /data/cmrnext-iter1.tar /data/cmrnext-iter2.tar /data/cmrnext-iter3.tar --data_folder /data/pandaset/ --dataset pandaset
In this section, we cover common problems and how to solve them.
docker: Error response from daemon: Unknown runtime specified nvidia.
If you receive this error when starting the Docker container:
docker: Error response from daemon: Unknown runtime specified nvidia.
Run the container with this command instead:
docker run --gpus=all --shm-size=2g -it -v /tmp/.X11-unix:/tmp/.X11-unix -v PATH_TO_DATA:/data -e DISPLAY -e XAUTHORITY -e NVIDIA_DRIVER_CAPABILITIES=all cmrnext
In our work and experiments, we have used components from other works. We thank the authors for open-sourcing their code.
- RAFT: https://github.com/princeton-vl/RAFT
- CMRNet: https://github.com/cattaneod/CMRNet
- https://github.com/dilaragokay/central-tendency-rotations/
- KISS-ICP: https://github.com/PRBonn/kiss-icp
- https://github.com/tomrunia/OpticalFlow_Visualization
For academic usage, the code is released under the GPLv3 license. For any commercial purpose, please contact the authors.