diff --git a/log.md b/log.md
new file mode 100644
index 0000000..b02bd8d
--- /dev/null
+++ b/log.md
@@ -0,0 +1,178 @@
+# How2comm Inference Log
+
+**Date:** 2026-03-14
+**Goal:** Run pretrained model inference on V2XSet and OPV2V datasets.
+
+---
+
+## Environment
+
+- OS: Linux (Ubuntu), CUDA 12.8 on system
+- Conda environment: `How2comm` at `/data/common/conda_envs_shared/yitonglin/How2comm`
+- Python: 3.9
+- PyTorch: 2.5.1+cu124
+- spconv: spconv-cu124
+
+---
+
+## Prerequisites
+
+### 1. Pretrained Models
+
+Download pretrained models from Google Drive or Baidu Disk (links in README.md).
+
+Place files as follows:
+```
+v2xvit/logs/how2comm/
+    config.yaml
+    net_epoch36.pth  # OPV2V pretrained model
+
+v2xvit/logs/how2comm_v2xset/
+    config.yaml
+    net_epoch32.pth  # V2XSet pretrained model
+```
+
+### 2. Datasets
+
+Place datasets at the repo root:
+```
+opv2v/
+    train/
+    validate/
+    test/
+
+v2xset/
+    train/
+    validate/
+    test/
+```
+
+The `validate_dir` in the config files should point to the `test/` split:
+- `v2xvit/logs/how2comm/config.yaml` → `validate_dir: /data/opv2v/test`
+- `v2xvit/logs/how2comm_v2xset/config.yaml` → `validate_dir: /home/yitonglin/OpenCOOD/centerformer/How2comm/v2xset/test`
+
+---
+
+## Setup Steps
+
+### Step 1: Install PyTorch with CUDA support (PyTorch is the deep learning framework: tensor computation, automatic differentiation, network construction, model training, and model inference)
+
+The environment must have a CUDA-enabled PyTorch build; a CPU-only build will fail.
+
+```sh
+pip install --force-reinstall torch==2.5.1+cu124 torchvision==0.20.1+cu124 \
+    --index-url https://download.pytorch.org/whl/cu124
+```
+
+Verify:
+```sh
+python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
+# Expected: 2.5.1+cu124 12.4 True
+```
+
+### Step 2: Install spconv (spconv is a sparse convolution library common in 3D point cloud, voxelization, and BEV detection tasks; cumm is the CUDA/C++ kernel generation and operator library that spconv depends on. If cumm is missing at runtime, `import spconv` fails with `ModuleNotFoundError: No module named 'cumm'`)
+
+```sh
+pip install spconv-cu124
+```
+
+### Step 3: Install other dependencies (icecream: debug printing utility; open3d: point cloud processing and visualization dependency)
+
+```sh
+pip install icecream
+pip install open3d
+```
+
+### Step 4: Install the v2xvit package
+
+From the repo root:
+```sh
+pip install -e . --config-settings editable_mode=compat
+```
+
+### Step 5: Build the MultiScaleDeformableAttention (MSDA) extension
+(Attention: attention mechanism; Deformable: deformable, sparse-sampling attention; MultiScale: multi-scale. In this project it is used for feature fusion, BEV/multi-vehicle feature interaction, and as a detection module within the backbone/neck/head.)
+
+This requires CUDA-enabled PyTorch (Step 1 must be done first).
+
+```sh
+python /path/to/centerformer/det3d/models/ops/setup.py build install
+```
+
+> **Note:** Replace `/path/to/centerformer` with the actual path to the centerformer repo on your machine. In our setup it is `/home/yitonglin/OpenCOOD/centerformer`.
+
+### Step 6: Build the box_overlaps Cython extension (computes how much two boxes overlap)
+
+```sh
+python v2xvit/utils/setup.py build_ext --inplace
+```
+
+### Step 7: Fix NumPy 2.0 incompatibility
+
+`np.Inf` was removed in NumPy 2.0. Patch `v2xvit/utils/pcd_utils.py` line 189:
+
+```python
+# Change:
+minimum = np.Inf
+# To:
+minimum = np.inf
+```
+
+This has already been applied in the current codebase.
+
+---
+
+## Running Inference
+
+Because `libc10.so` (PyTorch's core C++ library) is not on the default library path, set `LD_LIBRARY_PATH` when running inference; also set `PYTHONPATH` so that the `v2xvit` package is imported from the repo.
+
+Use the following command template, replacing `<MODEL_DIR>` and `<EPOCH>` with the model directory and the checkpoint epoch:
+
+```sh
+conda run -p /data/common/conda_envs_shared/yitonglin/How2comm bash -c \
+    "export LD_LIBRARY_PATH=/data/common/conda_envs_shared/yitonglin/How2comm/lib/python3.9/site-packages/torch/lib:\$LD_LIBRARY_PATH && \
+    PYTHONPATH=/home/yitonglin/OpenCOOD/centerformer/How2comm:\$PYTHONPATH \
+    python v2xvit/tools/inference.py --model_dir <MODEL_DIR> --eval_epoch <EPOCH>"
+```
+
+### V2XSet
+
+```sh
+conda run -p /data/common/conda_envs_shared/yitonglin/How2comm bash -c \
+    "export LD_LIBRARY_PATH=/data/common/conda_envs_shared/yitonglin/How2comm/lib/python3.9/site-packages/torch/lib:\$LD_LIBRARY_PATH && \
+    PYTHONPATH=/home/yitonglin/OpenCOOD/centerformer/How2comm:\$PYTHONPATH \
+    python v2xvit/tools/inference.py --model_dir v2xvit/logs/how2comm_v2xset --eval_epoch 32"
+```
+
+**Result:**
+```
+Epoch: 32 | AP @0.3: 0.8412 | AP @0.5: 0.8178 | AP @0.7: 0.6494 | comm_rate: 0.139198
+```
+
+### OPV2V
+
+```sh
+conda run -p /data/common/conda_envs_shared/yitonglin/How2comm bash -c \
+    "export LD_LIBRARY_PATH=/data/common/conda_envs_shared/yitonglin/How2comm/lib/python3.9/site-packages/torch/lib:\$LD_LIBRARY_PATH && \
+    PYTHONPATH=/home/yitonglin/OpenCOOD/centerformer/How2comm:\$PYTHONPATH \
+    python v2xvit/tools/inference.py --model_dir v2xvit/logs/how2comm --eval_epoch 36"
+```
+
+**Result:**
+```
+Epoch: 36 | AP @0.3: 0.8735 | AP @0.5: 0.8573 | AP @0.7: 0.7138 | comm_rate: 0.142713
+```
+
+---
+
+## Common Errors & Fixes
+
+| Error | Cause | Fix |
+|-------|-------|-----|
+| `ModuleNotFoundError: No module named 'torch'` | Wrong conda env active | Use `conda run -p ...` as shown above |
+| `ModuleNotFoundError: No module named 'cumm'` | spconv not installed | `pip install spconv-cu124` |
+| `ModuleNotFoundError: No module named 'MultiScaleDeformableAttention'` | MSDA extension not built | Run Step 5 above |
+| `NotImplementedError: Cuda is not availabel` | PyTorch installed without CUDA | Run Step 1 to reinstall with CUDA |
+| `AttributeError: np.Inf was removed in NumPy 2.0` | NumPy 2.0 incompatibility | Apply Step 7 patch |
+| `ImportError: libc10.so: cannot open shared object file` | torch lib not in `LD_LIBRARY_PATH` | Use the full command with `LD_LIBRARY_PATH` as shown above |
+| `ModuleNotFoundError: No module named 'v2xvit.models.comm_modules'` | Stale site-packages install | Run Step 4 with `PYTHONPATH` set as shown above |
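
Several rows in the error table above are `ModuleNotFoundError` cases, so it may help to check which modules can be located before launching inference. The following is a minimal sketch, not part of the How2comm repo: the helper names `is_importable` and `preflight` are hypothetical, and the module list is copied from the table and Step 2. It only locates modules without loading them, so it will not catch load-time failures such as the `libc10.so` error.

```python
import importlib.util

# Module names taken from the error table and Step 2 of this log.
# Whether each one is found depends entirely on the active environment.
REQUIRED_MODULES = [
    "torch",
    "cumm",
    "spconv",
    "MultiScaleDeformableAttention",
    "v2xvit.models.comm_modules",
]


def is_importable(name: str) -> bool:
    """Return True if `name` can be located (hypothetical helper).

    For dotted names, find_spec imports the parent package first, so a
    missing or broken parent surfaces as ImportError and counts as "not found".
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


def preflight(names=REQUIRED_MODULES) -> dict:
    """Map each module name to whether it can be located (hypothetical helper)."""
    return {name: is_importable(name) for name in names}


if __name__ == "__main__":
    for name, ok in preflight().items():
        status = "ok" if ok else "MISSING"
        print(f"{status:8s}{name}")
```

Run it inside the same environment as inference (for example via `conda run -p ...` with `PYTHONPATH` set as shown above); any module reported as `MISSING` should correspond to one of the fixes in the table.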