
This commit adds comprehensive CPU support for inference operations while
maintaining full backward compatibility with CUDA workflows. The changes
enable ChronoEdit to run on systems without GPU access by providing
graceful fallback mechanisms.

## Key Changes

### 1. New Central Device Management Module (chronoedit/utils/device_utils.py)
- Implements get_device() for automatic device detection with CUDA -> CPU fallback (see the sketch after this list)
- Provides get_device_type() for device type string resolution
- Adds get_device_map() for HuggingFace model compatibility
- Emits clear warnings when falling back to CPU
- Handles device validation and error cases gracefully
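
The helper bodies are not reproduced in this summary, so the following is a minimal sketch of the fallback behavior described above. The function names come from this PR; the implementations are assumptions, not the merged code:

```python
# Illustrative sketch of chronoedit/utils/device_utils.py -- names match the
# summary above, but these bodies are an assumption, not the merged code.
import warnings
from typing import Optional

import torch


def get_device(device: Optional[str] = None) -> torch.device:
    """Resolve the requested device, falling back from CUDA to CPU."""
    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    if device.startswith("cuda") and not torch.cuda.is_available():
        warnings.warn("CUDA requested but not available; falling back to CPU.")
        device = "cpu"
    return torch.device(device)


def get_device_type(device: Optional[str] = None) -> str:
    """Return the resolved device type string, e.g. 'cuda' or 'cpu'."""
    return get_device(device).type


def get_device_map(device: Optional[str] = None) -> str:
    """Return a value usable as HuggingFace from_pretrained(device_map=...)."""
    return get_device(device).type
```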

### 2. Prompt Enhancer Updates (scripts/prompt_enhancer.py)
- Added device parameter to load_model() function
- Updated pick_attn_implementation() to handle CPU-only mode
- Uses device_utils for consistent device management
- Automatically selects 'eager' attention on CPU, where flash-attention kernels are unavailable (sketched below)
- Supports explicit device specification or auto-detection
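
For illustration, a CPU-aware selection along these lines; the actual pick_attn_implementation() may order its preferences differently:

```python
def pick_attn_implementation(device_type: str) -> str:
    """Choose an attn_implementation string for transformers' from_pretrained()."""
    if device_type == "cpu":
        # Flash-attention kernels are CUDA-only, so CPU must fall back to eager.
        return "eager"
    try:
        import flash_attn  # noqa: F401
        return "flash_attention_2"
    except ImportError:
        return "sdpa"  # PyTorch scaled-dot-product attention as a middle ground
```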

### 3. Inference Script Enhancements (scripts/run_inference_diffusers.py)
- Integrated device_utils for robust device handling
- Updated device parameter flow throughout pipeline
- Fixed torch.Generator device placement so seeded generation works on CPU (sketched below)
- Passes device parameter to prompt enhancer
- Maintains existing --device CLI argument with enhanced fallback
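
The generator fix matters because constructing torch.Generator(device="cuda") raises on a CPU-only host. A hedged sketch of the pattern (make_generator is a hypothetical helper, not a name from this PR):

```python
import torch


def make_generator(device: torch.device, seed: int) -> torch.Generator:
    # Constructing a CUDA generator on a CPU-only machine raises immediately,
    # so the generator must be built on the already-resolved device.
    return torch.Generator(device=device).manual_seed(seed)


gen = make_generator(torch.device("cpu"), seed=42)
noise = torch.randn(4, generator=gen)  # reproducible on any host
```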

### 4. Pipeline CPU Compatibility (chronoedit_diffusers/pipeline_chronoedit.py)
- Wrapped torch.cuda.empty_cache() calls with availability checks (pattern shown below)
- All three cache clearing locations now check torch.cuda.is_available()
- Prevents crashes when running on CPU-only systems
- No functional changes for CUDA users
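
The guard itself is the one-line pattern the list describes, applied at each of the three cache-clearing sites:

```python
import torch

if torch.cuda.is_available():
    torch.cuda.empty_cache()  # skipped entirely on CPU-only systems
```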

### 5. Device Utility Hardening (chronoedit/_ext/imaginaire/utils/device.py)
- Made pynvml import optional (try/except with PYNVML_AVAILABLE flag; sketched below)
- Updated all GPU-related functions to check CUDA availability
- get_gpu_architecture() returns None for CPU mode
- print_gpu_mem() provides informative message for CPU mode
- gpu0_has_80gb_or_less() returns conservative default on CPU
- Device class raises clear error when instantiated without CUDA
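
A sketch of the optional-import pattern: PYNVML_AVAILABLE is named in this PR, but the function bodies below are illustrative, not the merged code:

```python
import torch

try:
    import pynvml
    PYNVML_AVAILABLE = True
except ImportError:
    pynvml = None
    PYNVML_AVAILABLE = False


def print_gpu_mem() -> None:
    """Print GPU memory usage, or an informative message in CPU mode."""
    if not PYNVML_AVAILABLE or not torch.cuda.is_available():
        print("Running in CPU mode; GPU memory stats are unavailable.")
        return
    pynvml.nvmlInit()
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    info = pynvml.nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU 0 memory: {info.used / 2**30:.1f} / {info.total / 2**30:.1f} GiB")


def gpu0_has_80gb_or_less() -> bool:
    # Conservative default on CPU: assume the lower-memory configuration.
    if not torch.cuda.is_available():
        return True
    return torch.cuda.get_device_properties(0).total_memory <= 80 * 2**30
```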

### 6. Test Infrastructure
- Created test_cpu_inference.sh for automated testing
- Configures HF_HOME to avoid disk space issues
- Activates conda environment properly
- Validates PyTorch and CUDA availability before running (equivalent check shown below)
- Tests full inference pipeline with minimal steps for speed
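
The pre-flight check the script performs presumably boils down to a Python snippet along these lines (illustrative):

```python
import torch

print(f"PyTorch {torch.__version__}; CUDA available: {torch.cuda.is_available()}")
```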

## Design Principles

1. **Minimal Changes**: Centralized device management reduces scattered modifications
2. **Backward Compatible**: No breaking changes to existing CUDA workflows
3. **Graceful Degradation**: CPU fallback with clear user warnings
4. **Clean Separation**: Training remains GPU-only, inference supports both
5. **Maintainable**: Single source of truth for device detection

## Usage Examples

```bash
# Auto-detect (CUDA if available, else CPU with warning)
python scripts/run_inference_diffusers.py --input image.png --prompt "..." --output out.mp4

# Explicit CPU
python scripts/run_inference_diffusers.py --device cpu --input image.png --prompt "..." --output out.mp4

# Explicit CUDA (with auto-fallback to CPU if unavailable)
python scripts/run_inference_diffusers.py --device cuda --input image.png --prompt "..." --output out.mp4
```

## Testing

Run the test script to verify CPU inference:
```bash
bash test_cpu_inference.sh cpu
```

## Notes

- Training functionality remains GPU-only (FSDP, distributed, etc.)
- CPU inference will be significantly slower than CUDA
- Flash attention automatically disabled on CPU (uses eager mode)
- All pynvml-dependent functions gracefully handle absence of GPU

## Files Modified

- chronoedit/utils/__init__.py (new)
- chronoedit/utils/device_utils.py (new)
- chronoedit/_ext/imaginaire/utils/device.py
- chronoedit_diffusers/pipeline_chronoedit.py
- scripts/prompt_enhancer.py
- scripts/run_inference_diffusers.py
- test_cpu_inference.sh (new)

Tested on: PyTorch 2.7.1+cu126 with CUDA unavailable
Environment: chronoedit_mini conda environment