
Fix SAM-2 API compatibility issue in preprocessing pipeline#158

Merged
kelseyee merged 1 commit into Wan-Video:main from ishaangupta-YB:main on Nov 14, 2025

Conversation

@ishaangupta-YB
Contributor

Issue
The preprocessing pipeline for replacement mode was failing with KeyError: 'frames_tracked_per_obj' (#157).

SAM-2's API expects the frames_tracked_per_obj dictionary to be present in the inference state, but video_predictor.py wasn't initializing this required key.

Fix: added initialization of frames_tracked_per_obj as an empty dictionary:

inference_state["frames_tracked_per_obj"] = {}
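To make the failure and the fix concrete, here is a minimal sketch using a plain dict as a stand-in for SAM-2's inference state (the real state carries many more keys; `setdefault` is shown as an equivalent defensive variant of the assignment above):

```python
# Minimal stand-in for SAM-2's inference state: a plain dict built during
# init_state(). The real state holds many more keys; only the one at
# issue is shown here.
inference_state = {"obj_ids": []}

# Without the fix, downstream SAM-2 code that indexes
# inference_state["frames_tracked_per_obj"] raises KeyError.
try:
    inference_state["frames_tracked_per_obj"][0]
except KeyError as e:
    print(f"KeyError: {e}")  # KeyError: 'frames_tracked_per_obj'

# The fix: initialize the key as an empty dict during state setup.
# setdefault won't clobber the value if some code path already set it.
inference_state.setdefault("frames_tracked_per_obj", {})
print("frames_tracked_per_obj" in inference_state)  # True
```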

Works fine now; see the output screenshot below.
[Screenshot: output of the preprocessing run]

Kindly merge this asap :)
@WanX-Video-1 @suruoxi @Steven-SWZhang

…issing frames_tracked_per_obj dictionary in inference state
@bekhzod2025

Hello everyone,

I encountered the same KeyError: 'frames_tracked_per_obj' when running the preprocess_data.py script in replace_flag mode. After some investigation, it appears this issue is related to an underlying incompatibility with the sam-2 package, which can manifest as either the KeyError or an ImportError: undefined symbol depending on the environment.

I was able to resolve this on my system (Ubuntu 22.04, Python 3.12, CUDA 12.1, RTX 4090) by ensuring sam-2 was compiled from source with the correct toolchain and then rebuilding its C++ extension in place.

Here is the full, validated procedure that solved the problem for me:

Solution: Recompile sam-2 from Source

  1. Set Up the Build Environment

First, ensure you have the necessary compiler and build tools. I used GCC/G++ version 12 to match the CUDA toolkit requirements.

# Update package lists
sudo apt update

# Install GCC-12, G++-12, and ninja
sudo apt install -y gcc-12 g++-12 ninja-build

# Upgrade pip and core build packages
pip install -U pip wheel setuptools numpy

  2. Configure Environment Variables

Next, configure your shell environment to point to the correct compiler and CUDA paths. This step is critical for ensuring the sam-2 package compiles against the right libraries.

# Set the C/C++ compiler
export CC=/usr/bin/gcc-12
export CXX=/usr/bin/g++-12

# Set CUDA home and update PATH
export CUDA_HOME=/usr/local/cuda-12.1
export PATH=$CUDA_HOME/bin:$PATH

# Add CUDA and PyTorch libraries to LD_LIBRARY_PATH
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$(
python - <<'PY'
import site,os
sp=site.getsitepackages()[0]
print(":".join(
  p for p in [
    os.path.join(sp,'nvidia','cublas','lib'),
    os.path.join(sp,'nvidia','cudnn','lib'),
    os.path.join(sp,'nvidia','cuda_runtime','lib'),
    os.path.join(sp,'torch','lib'),
  ] if os.path.isdir(p)
))
PY
)

# Specify the target CUDA architecture for your GPU (RTX 4090 is 8.9)
export TORCH_CUDA_ARCH_LIST="8.9"

# (Optional but recommended) Allow newer compilers if needed
export TORCH_NVCC_FLAGS="--allow-unsupported-compiler"

  3. Install sam-2 from Source
PIP_NO_BUILD_ISOLATION=1 pip install -e \
  "git+https://github.com/facebookresearch/sam2.git@0e78a118995e66bb27d78518c4bd9a3e95b4e266#egg=SAM-2" \
  --no-cache-dir
  4. Build the C++ Extension In-Place
# Navigate to the source directory inside your virtual environment
# Replace 'YOUR_VIR_ENV' with the path to your environment (e.g., venv, conda envs)
cd YOUR_VIR_ENV/src/sam-2/

# Build the extension in place
python setup.py build_ext --inplace
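After the in-place build, a quick sanity check is that Python can locate the package at all. A minimal sketch of the pattern, assuming the package installs under the module name `sam2`; `_ctypes` (a stdlib C extension) is used below only as a stand-in so the snippet runs anywhere:

```python
import importlib.util

def can_locate(mod_name: str) -> bool:
    """Return True if Python can find the module (pure or compiled)."""
    return importlib.util.find_spec(mod_name) is not None

# Stand-in check against a stdlib C extension; swap in "sam2" after
# building to confirm the package is visible to your interpreter.
print(can_locate("_ctypes"))  # True on a standard CPython build
```

If this returns False for `sam2`, the editable install or the in-place build did not land where your interpreter looks, which is worth ruling out before re-running the pipeline.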
  5. Re-run the Preprocessing Script
cd ~/Desktop/Wan2.2 # Or your project's root directory
python wan/modules/animate/preprocess/preprocess_data.py \
  --ckpt_path Wan2.2-Animate-14B/process_checkpoint \
  --video_path examples/wan_animate/replace/video.mp4 \
  --refer_path examples/wan_animate/replace/image.jpeg \
  --save_path examples/wan_animate/replace/process_results \
  --resolution_area 1280 720 --iterations 3 --k 7 --w_len 1 --h_len 1 --replace_flag

I hope this detailed walkthrough helps others in the community facing similar issues. This process ensures that all components are compiled and linked correctly, resolving the underlying library conflicts.

@yarodevuci

When can this be merged?

@bekhzod2025

> When can this be merged?

Hi @yarodevuci, thank you for bringing up the question. I have submitted a pull request with a detailed guide to resolve this issue. You can track the progress here:

PR #187: docs: Fix SAM-2 build issues on Ubuntu 22.04 with Python 3.12

This PR adds the necessary build steps to the README.md to fix the sam-2 compatibility errors. Once the maintainers review and merge it, the fix will be part of the official documentation.

@chiyin0207

> Issue: the preprocessing pipeline for replacement mode was failing with KeyError: 'frames_tracked_per_obj' (#157)
>
> SAM-2's API expects the frames_tracked_per_obj dictionary in the inference state, but video_predictor.py wasn't initializing this required key.
>
> Added initialization of frames_tracked_per_obj as an empty dictionary:
>
> inference_state["frames_tracked_per_obj"] = {}
>
> Works fine now; see the output screenshot.
>
> Kindly merge this asap :)

It works! Thanks.

@linfan

linfan commented Oct 4, 2025

It works for me like magic; hope this PR gets merged soon.

@doubleZ0108

> Issue: the preprocessing pipeline for replacement mode was failing with KeyError: 'frames_tracked_per_obj' (#157)
>
> SAM-2's API expects the frames_tracked_per_obj dictionary in the inference state, but video_predictor.py wasn't initializing this required key.
>
> Added initialization of frames_tracked_per_obj as an empty dictionary:
>
> inference_state["frames_tracked_per_obj"] = {}
>
> Works fine now; see the output screenshot.
>
> Kindly merge this asap :) @WanX-Video-1 @suruoxi @Steven-SWZhang

Add this line of code to both init_state() and init_state_v2() in class SAM2VideoPredictor in Wan2.2/wan/modules/animate/preprocess/video_predictor.py.
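A minimal sketch of where the line lands, using a stripped-down stand-in for SAM2VideoPredictor (the class and method bodies here are hypothetical skeletons; only the added assignment mirrors the actual patch):

```python
class SAM2VideoPredictorSketch:
    """Stand-in for the real predictor; only state setup is shown."""

    def init_state(self):
        inference_state = {}  # ...real setup populates many more keys...
        inference_state["frames_tracked_per_obj"] = {}  # the added line
        return inference_state

    def init_state_v2(self):
        inference_state = {}  # ...real setup populates many more keys...
        inference_state["frames_tracked_per_obj"] = {}  # the added line
        return inference_state

predictor = SAM2VideoPredictorSketch()
print("frames_tracked_per_obj" in predictor.init_state())     # True
print("frames_tracked_per_obj" in predictor.init_state_v2())  # True
```

Adding it in both initializers matters because either entry point can build the state that SAM-2's tracking code later indexes.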

It works, thanks!

@kelseyee kelseyee merged commit e978357 into Wan-Video:main Nov 14, 2025

8 participants