-
Notifications
You must be signed in to change notification settings - Fork 488
Feature/edm precond 3d #1220
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature/edm precond 3d #1220
Conversation
…dels, utilized for urban flow project under NVIDIA grant Signed-off-by: Abhijeet Vishwasrao <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Greptile Overview
Greptile Summary
This PR extends PhysicsNeMo's diffusion modeling capabilities to 3D volumetric data by adding SongUNet3D and EDMPrecond3D classes.
Key Changes:
-
SongUNet3D: A 904-line implementation of 3D U-Net diffusion backbone that extends DDPM++ and NCSN++ architectures to volumetric data (B, C, D, H, W tensors)- Includes custom
Conv3dlayer with fused up/downsampling using 3D separable filters - Custom
GroupNormimplementation optimized for channels-last memory layout during inference UNetBlock3Dwith 3D convolutions, self-attention on flattened spatial dimensions, and adaptive scaling- Supports flexible resolution (uniform or non-uniform D, H, W), multiple encoder/decoder types, and gradient checkpointing
- Includes custom
-
EDMPrecond3D: ExtendsEDMPrecondparent class with proper 3D tensor handling- Key difference: sigma reshaping from
(B, 1, 1, 1)to(B, 1, 1, 1, 1)for 5D tensors - Otherwise inherits all EDM preconditioning logic (c_skip, c_out, c_in, c_noise computations)
- Key difference: sigma reshaping from
-
Test Coverage: Comprehensive test suite with 390 lines covering forward pass, constructor options, CUDA graphs, JIT, AMP, torch.compile, checkpointing, ONNX export, and gradient checkpointing for both DDPM++ and NCSN++ variants
Issues Found:
- Typo in docstring: "itsuitable" should be "it suitable" (line 191 of
song_unet3d.py) - CHANGELOG date is 2025-11-10, one day in the future
Confidence Score: 4/5
- This PR is safe to merge with minor documentation fixes recommended.
- The implementation is well-structured and follows the existing 2D SongUNet patterns appropriately extended to 3D. The EDMPrecond3D correctly inherits from EDMPrecond with only the necessary tensor dimension adjustments. Comprehensive test coverage includes multiple model variants and optimization paths. Only two minor issues found: a documentation typo and a future changelog date. No logical errors, security issues, or breaking changes detected.
- All files are in good shape. Minor fixes needed in
physicsnemo/models/diffusion/song_unet3d.py(typo) andCHANGELOG.md(date).
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| physicsnemo/models/diffusion/song_unet3d.py | 4/5 | New 3D U-Net diffusion backbone extending DDPM++/NCSN++ to volumetric data. Includes Conv3d, GroupNorm, and UNetBlock3D with comprehensive documentation. Minor typo at line 191. |
| physicsnemo/models/diffusion/preconditioning.py | 5/5 | Added EDMPrecond3D class that extends EDMPrecond for 3D volumetric inputs. Properly adjusts sigma reshaping from (B,1,1,1) to (B,1,1,1,1) for 3D tensors. Clean implementation. |
| test/models/diffusion/test_song_unet3d.py | 5/5 | Comprehensive test suite covering forward pass, constructor options, optimizations (CUDA graphs, JIT, AMP, torch.compile), checkpointing, ONNX deployment, and gradient checkpointing for both DDPM++ and NCSN++ variants. |
| CHANGELOG.md | 4/5 | Updated changelog with new 3D diffusion features. Date shows 2025-11-10 which is one day in the future. |
Sequence Diagram
sequenceDiagram
participant User
participant EDMPrecond3D
participant SongUNet3D
participant Conv3d
participant UNetBlock3D
participant GroupNorm
User->>EDMPrecond3D: forward(x, sigma, condition, class_labels)
Note over EDMPrecond3D: x shape: (B, C, D, H, W)
EDMPrecond3D->>EDMPrecond3D: Compute preconditioning scales<br/>(c_skip, c_out, c_in, c_noise)
EDMPrecond3D->>EDMPrecond3D: Reshape sigma to (B, 1, 1, 1, 1)
EDMPrecond3D->>EDMPrecond3D: Concatenate condition if provided
EDMPrecond3D->>SongUNet3D: model(arg, c_noise, class_labels)
Note over SongUNet3D: Embedding Generation
SongUNet3D->>SongUNet3D: Generate noise/class/augment embeddings
Note over SongUNet3D: Encoder Pass
loop Each encoder level
SongUNet3D->>Conv3d: Initial convolution (level 0)
SongUNet3D->>UNetBlock3D: Downsample (levels > 0)
loop num_blocks times
SongUNet3D->>UNetBlock3D: Process block with embeddings
UNetBlock3D->>GroupNorm: Normalize features
UNetBlock3D->>Conv3d: 3D convolution
UNetBlock3D->>UNetBlock3D: Apply self-attention if at attn_resolution
end
SongUNet3D->>SongUNet3D: Cache features for skip connections
end
Note over SongUNet3D: Decoder Pass
loop Each decoder level (reversed)
SongUNet3D->>UNetBlock3D: Upsample
SongUNet3D->>SongUNet3D: Concatenate skip connections
loop num_blocks + 1 times
SongUNet3D->>UNetBlock3D: Process block with embeddings
UNetBlock3D->>GroupNorm: Normalize features
UNetBlock3D->>Conv3d: 3D convolution
end
end
SongUNet3D-->>EDMPrecond3D: F_x (denoised features)
EDMPrecond3D->>EDMPrecond3D: D_x = c_skip * x + c_out * F_x
EDMPrecond3D-->>User: D_x (denoised output)
7 files reviewed, 2 comments
| Group Normalization (GN) divides the channels of the input tensor into groups and | ||
| normalizes the features within each group independently. It does not require the | ||
| batch size as in Batch Normalization, making itsuitable for batch sizes of any size |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax: itsuitable should be it suitable (add space)
| batch size as in Batch Normalization, making itsuitable for batch sizes of any size | |
| making it suitable for batch sizes of any size |
Greptile OverviewGreptile SummaryThis PR adds comprehensive 3D volumetric support to PhysicsNeMo's diffusion models by introducing Key Changes:
Minor Issues:
The implementation is clean, well-tested, and follows established patterns from the 2D version. Confidence Score: 5/5
Important Files ChangedFile Analysis
Sequence DiagramsequenceDiagram
participant User
participant EDMPrecond3D
participant SongUNet3D
participant Conv3d
participant UNetBlock3D
participant GroupNorm
participant AttentionOp
User->>EDMPrecond3D: forward(x, sigma, condition, class_labels)
Note over EDMPrecond3D: Input: (B, C, D, H, W)
EDMPrecond3D->>EDMPrecond3D: Compute preconditioning coefficients<br/>(c_skip, c_out, c_in, c_noise)
EDMPrecond3D->>EDMPrecond3D: Reshape sigma to (B, 1, 1, 1, 1)
EDMPrecond3D->>EDMPrecond3D: Concatenate condition if provided
EDMPrecond3D->>SongUNet3D: model(c_in * x, c_noise, class_labels)
SongUNet3D->>SongUNet3D: Generate noise/class embeddings
Note over SongUNet3D: Encoder Phase
loop For each resolution level
SongUNet3D->>Conv3d: Initial 3D convolution
Conv3d-->>SongUNet3D: Features
SongUNet3D->>SongUNet3D: Add spatial_emb if additive_pos_embed
loop For each block
SongUNet3D->>UNetBlock3D: Process with emb
UNetBlock3D->>GroupNorm: Normalize activations
GroupNorm-->>UNetBlock3D: Normalized
UNetBlock3D->>Conv3d: 3D convolution with up/down
Conv3d-->>UNetBlock3D: Transformed features
alt Has attention
UNetBlock3D->>AttentionOp: Self-attention on 3D volume
AttentionOp-->>UNetBlock3D: Attended features
end
UNetBlock3D-->>SongUNet3D: Block output
SongUNet3D->>SongUNet3D: Store skip connection
end
end
Note over SongUNet3D: Decoder Phase
loop For each resolution level (reversed)
SongUNet3D->>SongUNet3D: Concatenate skip connections
loop For each block
SongUNet3D->>UNetBlock3D: Process with emb and skips
UNetBlock3D->>Conv3d: 3D upsampling/convolution
Conv3d-->>UNetBlock3D: Upsampled features
UNetBlock3D-->>SongUNet3D: Decoded features
end
SongUNet3D->>Conv3d: aux_conv for output
Conv3d-->>SongUNet3D: Output volume
end
SongUNet3D-->>EDMPrecond3D: F_x (denoised output)
EDMPrecond3D->>EDMPrecond3D: D_x = c_skip * x + c_out * F_x
EDMPrecond3D-->>User: Denoised 3D volume (B, C, D, H, W)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7 files reviewed, 1 comment
| ## [1.3.0a0] - 2025-11-10 | ||
|
|
||
| ### Added | ||
|
|
||
| - Added 3D-EDMPrecond wrapper class with 3D version of SongUnet for diffusion models: | ||
| - `SongUNet3D`: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric data | ||
| - New test to cover the SongUnet3D | ||
| - `EDMPrecond3D`: 3D preconditioning wrapper for volumetric diffusion models | ||
|
|
||
| ## [1.3.0a0] - 2025-XX-YY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
syntax: duplicate version header - there are two ## [1.3.0a0] sections (lines 9 and 18), which should be merged into one
| ## [1.3.0a0] - 2025-11-10 | |
| ### Added | |
| - Added 3D-EDMPrecond wrapper class with 3D version of SongUnet for diffusion models: | |
| - `SongUNet3D`: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric data | |
| - New test to cover the SongUnet3D | |
| - `EDMPrecond3D`: 3D preconditioning wrapper for volumetric diffusion models | |
| ## [1.3.0a0] - 2025-XX-YY | |
| ## [1.3.0a0] - 2025-XX-YY | |
| ### Added | |
| - Added 3D-EDMPrecond wrapper class with 3D version of SongUnet for diffusion models: | |
| - `SongUNet3D`: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric data | |
| - New test to cover the SongUnet3D | |
| - `EDMPrecond3D`: 3D preconditioning wrapper for volumetric diffusion models |
PhysicsNeMo Pull Request
Description
SongUNet3D: 3D U-Net diffusion backbone extending DDPM++ and NCSN++ to volumetric dataEDMPrecond3D: 3D preconditioning wrapper for volumetric diffusion modelsChecklist
Dependencies
No addition dependencies
Review Process
All PRs are reviewed by the PhysicsNeMo team before merging.
Depending on which files are changed, GitHub may automatically assign a maintainer for review.
We are also testing AI-based code review tools (e.g., Greptile), which may add automated comments with a confidence score.
This score reflects the AI’s assessment of merge readiness and is not a qualitative judgment of your work, nor is
it an indication that the PR will be accepted / rejected.
AI-generated feedback should be reviewed critically for usefulness.
You are not required to respond to every AI comment, but they are intended to help both authors and reviewers.
Please react to Greptile comments with 👍 or 👎 to provide feedback on their accuracy.