Skip to content

Conversation

@AbVishwas
Copy link

PhysicsNeMo Pull Request

Description

  • Added 3D-EDMPrecond wrapper class with 3D version of SongUnet for diffusion models:
    • SongUNet3D: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric data
    • New test to cover the SongUnet3D
    • EDMPrecond3D: 3D preconditioning wrapper for volumetric diffusion models

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.
  • An issue is linked to this pull request.

Dependencies

No addition dependencies

Review Process

All PRs are reviewed by the PhysicsNeMo team before merging.

Depending on which files are changed, GitHub may automatically assign a maintainer for review.

We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.

AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.

Abhijeet Vishwasrao and others added 2 commits November 7, 2025 23:00
…dels, utilized for urban flow project under NVIDIA grant

Signed-off-by: Abhijeet Vishwasrao <[email protected]>
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Greptile Overview

Greptile Summary

This PR extends PhysicsNeMo's diffusion modeling capabilities to 3D volumetric data by adding SongUNet3D and EDMPrecond3D classes.

Key Changes:

  • SongUNet3D: A 904-line implementation of 3D U-Net diffusion backbone that extends DDPM++ and NCSN++ architectures to volumetric data (B, C, D, H, W tensors)

    • Includes custom Conv3d layer with fused up/downsampling using 3D separable filters
    • Custom GroupNorm implementation optimized for channels-last memory layout during inference
    • UNetBlock3D with 3D convolutions, self-attention on flattened spatial dimensions, and adaptive scaling
    • Supports flexible resolution (uniform or non-uniform D, H, W), multiple encoder/decoder types, and gradient checkpointing
  • EDMPrecond3D: Extends EDMPrecond parent class with proper 3D tensor handling

    • Key difference: sigma reshaping from (B, 1, 1, 1) to (B, 1, 1, 1, 1) for 5D tensors
    • Otherwise inherits all EDM preconditioning logic (c_skip, c_out, c_in, c_noise computations)
  • Test Coverage: Comprehensive test suite with 390 lines covering forward pass, constructor options, CUDA graphs, JIT, AMP, torch.compile, checkpointing, ONNX export, and gradient checkpointing for both DDPM++ and NCSN++ variants

Issues Found:

  • Typo in docstring: "itsuitable" should be "it suitable" (line 191 of song_unet3d.py)
  • CHANGELOG date is 2025-11-10, one day in the future

Confidence Score: 4/5

  • This PR is safe to merge with minor documentation fixes recommended.
  • The implementation is well-structured and follows the existing 2D SongUNet patterns appropriately extended to 3D. The EDMPrecond3D correctly inherits from EDMPrecond with only the necessary tensor dimension adjustments. Comprehensive test coverage includes multiple model variants and optimization paths. Only two minor issues found: a documentation typo and a future changelog date. No logical errors, security issues, or breaking changes detected.
  • All files are in good shape. Minor fixes needed in physicsnemo/models/diffusion/song_unet3d.py (typo) and CHANGELOG.md (date).

Important Files Changed

File Analysis

Filename Score Overview
physicsnemo/models/diffusion/song_unet3d.py 4/5 New 3D U-Net diffusion backbone extending DDPM++/NCSN++ to volumetric data. Includes Conv3d, GroupNorm, and UNetBlock3D with comprehensive documentation. Minor typo at line 191.
physicsnemo/models/diffusion/preconditioning.py 5/5 Added EDMPrecond3D class that extends EDMPrecond for 3D volumetric inputs. Properly adjusts sigma reshaping from (B,1,1,1) to (B,1,1,1,1) for 3D tensors. Clean implementation.
test/models/diffusion/test_song_unet3d.py 5/5 Comprehensive test suite covering forward pass, constructor options, optimizations (CUDA graphs, JIT, AMP, torch.compile), checkpointing, ONNX deployment, and gradient checkpointing for both DDPM++ and NCSN++ variants.
CHANGELOG.md 4/5 Updated changelog with new 3D diffusion features. Date shows 2025-11-10 which is one day in the future.

Sequence Diagram

sequenceDiagram
    participant User
    participant EDMPrecond3D
    participant SongUNet3D
    participant Conv3d
    participant UNetBlock3D
    participant GroupNorm
    
    User->>EDMPrecond3D: forward(x, sigma, condition, class_labels)
    Note over EDMPrecond3D: x shape: (B, C, D, H, W)
    EDMPrecond3D->>EDMPrecond3D: Compute preconditioning scales<br/>(c_skip, c_out, c_in, c_noise)
    EDMPrecond3D->>EDMPrecond3D: Reshape sigma to (B, 1, 1, 1, 1)
    EDMPrecond3D->>EDMPrecond3D: Concatenate condition if provided
    
    EDMPrecond3D->>SongUNet3D: model(arg, c_noise, class_labels)
    
    Note over SongUNet3D: Embedding Generation
    SongUNet3D->>SongUNet3D: Generate noise/class/augment embeddings
    
    Note over SongUNet3D: Encoder Pass
    loop Each encoder level
        SongUNet3D->>Conv3d: Initial convolution (level 0)
        SongUNet3D->>UNetBlock3D: Downsample (levels > 0)
        loop num_blocks times
            SongUNet3D->>UNetBlock3D: Process block with embeddings
            UNetBlock3D->>GroupNorm: Normalize features
            UNetBlock3D->>Conv3d: 3D convolution
            UNetBlock3D->>UNetBlock3D: Apply self-attention if at attn_resolution
        end
        SongUNet3D->>SongUNet3D: Cache features for skip connections
    end
    
    Note over SongUNet3D: Decoder Pass
    loop Each decoder level (reversed)
        SongUNet3D->>UNetBlock3D: Upsample
        SongUNet3D->>SongUNet3D: Concatenate skip connections
        loop num_blocks + 1 times
            SongUNet3D->>UNetBlock3D: Process block with embeddings
            UNetBlock3D->>GroupNorm: Normalize features
            UNetBlock3D->>Conv3d: 3D convolution
        end
    end
    
    SongUNet3D-->>EDMPrecond3D: F_x (denoised features)
    
    EDMPrecond3D->>EDMPrecond3D: D_x = c_skip * x + c_out * F_x
    EDMPrecond3D-->>User: D_x (denoised output)
Loading

7 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Group Normalization (GN) divides the channels of the input tensor into groups and
normalizes the features within each group independently. It does not require the
batch size as in Batch Normalization, making itsuitable for batch sizes of any size
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: itsuitable should be it suitable (add space)

Suggested change
batch size as in Batch Normalization, making itsuitable for batch sizes of any size
making it suitable for batch sizes of any size

@greptile-apps
Copy link
Contributor

greptile-apps bot commented Nov 11, 2025

Greptile Overview

Greptile Summary

This PR adds comprehensive 3D volumetric support to PhysicsNeMo's diffusion models by introducing SongUNet3D and EDMPrecond3D.

Key Changes:

  • New SongUNet3D class (903 lines) extends DDPM++ and NCSN++ architectures to 3D volumes with proper 3D convolution, group normalization, and self-attention operations
  • EDMPrecond3D wrapper properly handles 5D tensors (B,C,D,H,W) with correct sigma reshaping
  • Comprehensive test suite covering forward pass, constructors, optimizations (CUDA graphs, JIT, AMP), checkpointing, ONNX export, and gradient checkpointing
  • Well-documented with extensive docstrings following NumPy style
  • Supports flexible input shapes (cubic, rectangular, non-uniform D/H/W), gradient checkpointing, and optional positional embeddings

Minor Issues:

  • Duplicate ## [1.3.0a0] version headers in CHANGELOG.md (lines 9 and 18) need to be merged
  • Two previous comments already noted about typo (itsuitableit suitable) and future date in changelog

The implementation is clean, well-tested, and follows established patterns from the 2D version.

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it adds new 3D functionality without modifying existing code
  • Score of 5 reflects high-quality implementation with comprehensive testing, excellent documentation, proper architectural design following existing patterns, and only trivial issues (duplicate CHANGELOG header and previously noted typo). The changes are purely additive with no breaking changes to existing functionality.
  • CHANGELOG.md needs the duplicate version header merged (minor fix)

Important Files Changed

File Analysis

Filename Score Overview
physicsnemo/models/diffusion/song_unet3d.py 5/5 New 3D U-Net implementation for volumetric diffusion. Well-documented with comprehensive docstrings, proper 3D convolution/attention operations, and gradient checkpointing support. Clean extension of 2D architecture.
physicsnemo/models/diffusion/preconditioning.py 5/5 Added EDMPrecond3D class that properly extends EDMPrecond for 3D volumes. Correctly reshapes sigma tensor to match 5D input (B,C,D,H,W). Implementation is clean and follows existing patterns.
test/models/diffusion/test_song_unet3d.py 5/5 Comprehensive test suite covering forward pass, constructor options, optimizations (CUDA graphs, JIT, AMP), checkpointing, deployment (ONNX), and gradient checkpointing. Tests both DDPM++ and NCSN++ variants.
CHANGELOG.md 4/5 Changelog updated to document new 3D diffusion models. Two entries for same version exist (duplicate 1.3.0a0 headers on lines 9 and 18).

Sequence Diagram

sequenceDiagram
    participant User
    participant EDMPrecond3D
    participant SongUNet3D
    participant Conv3d
    participant UNetBlock3D
    participant GroupNorm
    participant AttentionOp

    User->>EDMPrecond3D: forward(x, sigma, condition, class_labels)
    Note over EDMPrecond3D: Input: (B, C, D, H, W)
    EDMPrecond3D->>EDMPrecond3D: Compute preconditioning coefficients<br/>(c_skip, c_out, c_in, c_noise)
    EDMPrecond3D->>EDMPrecond3D: Reshape sigma to (B, 1, 1, 1, 1)
    EDMPrecond3D->>EDMPrecond3D: Concatenate condition if provided
    EDMPrecond3D->>SongUNet3D: model(c_in * x, c_noise, class_labels)
    
    SongUNet3D->>SongUNet3D: Generate noise/class embeddings
    Note over SongUNet3D: Encoder Phase
    loop For each resolution level
        SongUNet3D->>Conv3d: Initial 3D convolution
        Conv3d-->>SongUNet3D: Features
        SongUNet3D->>SongUNet3D: Add spatial_emb if additive_pos_embed
        loop For each block
            SongUNet3D->>UNetBlock3D: Process with emb
            UNetBlock3D->>GroupNorm: Normalize activations
            GroupNorm-->>UNetBlock3D: Normalized
            UNetBlock3D->>Conv3d: 3D convolution with up/down
            Conv3d-->>UNetBlock3D: Transformed features
            alt Has attention
                UNetBlock3D->>AttentionOp: Self-attention on 3D volume
                AttentionOp-->>UNetBlock3D: Attended features
            end
            UNetBlock3D-->>SongUNet3D: Block output
            SongUNet3D->>SongUNet3D: Store skip connection
        end
    end
    
    Note over SongUNet3D: Decoder Phase
    loop For each resolution level (reversed)
        SongUNet3D->>SongUNet3D: Concatenate skip connections
        loop For each block
            SongUNet3D->>UNetBlock3D: Process with emb and skips
            UNetBlock3D->>Conv3d: 3D upsampling/convolution
            Conv3d-->>UNetBlock3D: Upsampled features
            UNetBlock3D-->>SongUNet3D: Decoded features
        end
        SongUNet3D->>Conv3d: aux_conv for output
        Conv3d-->>SongUNet3D: Output volume
    end
    
    SongUNet3D-->>EDMPrecond3D: F_x (denoised output)
    EDMPrecond3D->>EDMPrecond3D: D_x = c_skip * x + c_out * F_x
    EDMPrecond3D-->>User: Denoised 3D volume (B, C, D, H, W)
Loading

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

7 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines +9 to 18
## [1.3.0a0] - 2025-11-10

### Added

- Added 3D-EDMPrecond wrapper class with 3D version of SongUnet for diffusion models:
- `SongUNet3D`: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric data
- New test to cover the SongUnet3D
- `EDMPrecond3D`: 3D preconditioning wrapper for volumetric diffusion models

## [1.3.0a0] - 2025-XX-YY
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

syntax: duplicate version header - there are two ## [1.3.0a0] sections (lines 9 and 18), which should be merged into one

Suggested change
## [1.3.0a0] - 2025-11-10
### Added
- Added 3D-EDMPrecond wrapper class with 3D version of SongUnet for diffusion models:
- `SongUNet3D`: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric data
- New test to cover the SongUnet3D
- `EDMPrecond3D`: 3D preconditioning wrapper for volumetric diffusion models
## [1.3.0a0] - 2025-XX-YY
## [1.3.0a0] - 2025-XX-YY
### Added
- Added 3D-EDMPrecond wrapper class with 3D version of SongUnet for diffusion models:
- `SongUNet3D`: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric data
- New test to cover the SongUnet3D
- `EDMPrecond3D`: 3D preconditioning wrapper for volumetric diffusion models

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants