Improve lead time support for diffusion models #980


Open · wants to merge 20 commits into base `main`

Conversation

@jleinonen (Collaborator)

PhysicsNeMo Pull Request

Description

Adds/fixes support for lead-time labels in various places where it was missing or not working:

  • SongUNetPosEmbd now works properly with either, both, or neither of positional embeddings and lead-time embeddings. Previously, some code paths could try to access attributes of these even when they were set to None.
  • deterministic_sampler now accepts lead-time labels and passes them through to the model, if given.
  • EDMLoss also now supports lead-time labels.
  • Added tests for the above features.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • The CHANGELOG.md is up to date with these changes.
  • An issue is linked to this pull request.

Dependencies

No new dependencies needed.

@jleinonen self-assigned this Jun 17, 2025
@jleinonen (Collaborator, Author)

/blossom-ci

@jleinonen (Collaborator, Author)

/blossom-ci

@jleinonen (Collaborator, Author)

/blossom-ci

@jleinonen (Collaborator, Author)

/blossom-ci

@jleinonen (Collaborator, Author)

/blossom-ci

@CharlelieLrt (Collaborator) left a comment


All the proposed changes look good to me. Just a few details require improvement:

  1. As far as I understand, this PR decouples lead-time embeddings from positional embeddings, in order to allow more flexibility in using them independently from each other. This new functionality does not seem to be used in any of the training recipes/examples. It could be useful to detail in the PR description the broader context (e.g. which applications is it going to be applied to? Will there be a follow-up PR? etc...)

  2. The new flexibility to independently use lead-time and positional embeddings should be clearly explained in the docstrings.

  3. IMO the current implementation of the lead-time embeddings has too many failure modes to be safely exposed to broader applications. For example, in `positional_embedding_indexing`:

  • `lead_time_label` can be `None` while `self.lt_embd` is not `None`, which leads to an error
  • Conversely, `lead_time_label` could be a user-provided tensor while `self.lt_embd` is `None`, which leads to `lead_time_label` being silently ignored.

I strongly support better parameter validation to eliminate these failure modes, either in the `forward` method or in `__init__` when possible.
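For illustration, a minimal, torch-free sketch of the kind of parameter validation suggested above. The names `lt_embd` and `lead_time_label` follow this discussion; the helper itself is hypothetical and not the actual PhysicsNeMo implementation:

```python
def validate_lead_time_inputs(lt_embd, lead_time_label):
    """Fail loudly on the two failure modes described above.

    Hypothetical sketch: `lt_embd` stands in for the model's lead-time
    embedding table (or None if disabled), `lead_time_label` for the
    user-provided labels (or None if not given).
    """
    if lt_embd is not None and lead_time_label is None:
        # previously: an obscure indexing error deep in the forward pass
        raise ValueError(
            "lead_time_label is required because the model was built "
            "with lead-time embeddings (lead_time_channels > 0)."
        )
    if lt_embd is None and lead_time_label is not None:
        # previously: the label was silently ignored
        raise ValueError(
            "lead_time_label was given, but the model has no lead-time "
            "embeddings; it would be silently ignored."
        )
```

A check like this could run at the top of `forward` (or, for configuration-only constraints, in `__init__`), turning both failure modes into immediate, descriptive errors.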

@jleinonen
Copy link
Collaborator Author

Hi @CharlelieLrt,

  1. As far as I understand, this PR decouples lead-time embeddings from positional embeddings, in order to allow more flexibility in using them independently from each other. This new functionality does not seem to be used in any of the training recipes/examples. It could be useful to detail in the PR description the broader context (e.g. which applications is it going to be applied to? Will there be a follow-up PR? etc...)

I would say lead-time embeddings were already decoupled from positional embeddings before the PR. This PR just includes some fixes to make sure that they can be enabled when positional embeddings are disabled, or vice versa.

  2. The new flexibility to independently use lead-time and positional embeddings should be clearly explained in the docstrings.

As they were already implemented independently, I don't think it's a new flexibility, but I can improve the docstrings in that regard.

  3. IMO the current implementation of the lead-time embeddings has too many failure modes to be safely exposed to broader applications. For example, in `positional_embedding_indexing`:
  • `lead_time_label` can be `None` while `self.lt_embd` is not `None`, which leads to an error
  • Conversely, `lead_time_label` could be a user-provided tensor while `self.lt_embd` is `None`, which leads to `lead_time_label` being silently ignored.

I strongly support better parameter validation to eliminate these failure modes, either in the `forward` method or in `__init__` when possible.

I'll add some checks to make sure the inputs conform to the model configuration (but note that, as far as I understand, these failure modes already existed before this PR).

@jleinonen
Copy link
Collaborator Author

/blossom-ci

Comment on lines +835 to +838
if (lead_time_channels is None) or (lead_time_channels <= 0):
raise ValueError(
"`lead_time_channels` must be >= 1 if `lead_time_mode` is enabled."
)
Collaborator


Agree with this validation. However, there is clearly redundancy between lead_time_mode and lead_time_channels. In addition, for positional embeddings, we can disable them by setting N_grid_channels = 0 (there is no boolean parameter positional_embedding_mode), so we expect the same mechanism to work for lead-time embeddings (i.e. disable by setting lead_time_channels = 0).

If your concern is about backward compatibility if removing lead_time_mode:

  1. This parameter was introduced only very recently (<2 months ago), so it could be okay to remove it.
  2. If backward compatibility is still a concern, we can keep `lead_time_mode` but add a deprecation warning in this if statement:
if self.lead_time_mode:
    warnings.warn(
        "The parameter `lead_time_mode` will be deprecated in a future "
        "version. The recommended way to enable (disable) lead-time "
        "embeddings is to set `lead_time_channels > 0` "
        "(`lead_time_channels = 0`).",
        DeprecationWarning,
    )
    [...]

embeddings = self.pos_embd
elif lead_time_label is None: # lead time embedding only
embeddings = self.lt_embd[lead_time_label[0].int()]
Collaborator


Why use only `lead_time_label[0]` here? I know it follows the original implementation, but isn't it an error?

Collaborator


The usage of `[0]` does seem like an error to me... wouldn't that select only the first lead time in either the batch or the lead-time dimension (whichever comes first in that tensor)? I see it was introduced in #913, maybe @tge25 can chime in?

Collaborator


Unrelated, but I think there is a second bug here. The elif clauses should read `elif self.pos_embd is not None` and `elif self.lead_time_label is not None`, right?

Collaborator


I confirmed with @tge25 that the `[0]` is some prototyping code, so we should get rid of it.

`elif self.lead_time_label is not None` is not possible, because `lead_time_label` is not an attribute. But I agree that the elif clause should be different; I think it should be `elif self.lt_embd is not None`, right?
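To make the agreed-on fix concrete, here is a hypothetical, torch-free sketch of the corrected branch structure: the second branch tests `lt_embd is not None` rather than `lead_time_label is None`, and the labels are indexed per sample instead of via `lead_time_label[0]`. Plain lists stand in for the module's tensors, so this shows only the control flow, not actual PhysicsNeMo code:

```python
def select_embeddings(pos_embd, lt_embd, lead_time_label=None):
    """Pick which embeddings to use, mirroring the proposed elif fix."""
    if pos_embd is not None and lt_embd is None:
        # positional embedding only
        return pos_embd
    elif lt_embd is not None and pos_embd is None:
        # lead-time embedding only: index per sample,
        # not lt_embd[lead_time_label[0]]
        return [lt_embd[int(label)] for label in lead_time_label]
    elif pos_embd is not None and lt_embd is not None:
        # both enabled: combination logic omitted in this sketch
        raise NotImplementedError
    return None
```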

@@ -245,6 +253,10 @@ def __call__(self, net, images, condition=None, labels=None, augment_pipe=None):
An optional data augmentation function that takes images as input and
returns augmented images. If not provided, no data augmentation is applied.

lead_time_label: torch.Tensor, optional
Lead-time labels to pass to the model, shape (batch_size, 1).
@CharlelieLrt (Collaborator) commented on Jul 17, 2025


Are you sure about the shape `(batch_size, 1)` of `lead_time_label`? I've seen other places (not in your PR) where it's `(batch_size,)`, and some others where it's `(batch_size, 1, 1, 1)`.
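One way to cope with the shape inconsistency noted above would be a small normalization helper that accepts `(batch_size,)`, `(batch_size, 1)`, or `(batch_size, 1, 1, 1)` and returns one label per sample. This is a hypothetical sketch using nested lists in place of tensors, not actual PhysicsNeMo code:

```python
def _flatten(x):
    """Yield leaf values from arbitrarily nested lists/tuples."""
    for item in x:
        if isinstance(item, (list, tuple)):
            yield from _flatten(item)
        else:
            yield item

def normalize_lead_time_label(label, batch_size):
    """Normalize any of the reported label shapes to a flat per-sample list.

    Raises if the number of labels does not match the batch size, instead
    of silently broadcasting or truncating.
    """
    flat = list(_flatten(label))
    if len(flat) != batch_size:
        raise ValueError(
            f"expected {batch_size} lead-time labels, got {len(flat)}"
        )
    return flat
```

With tensors, the equivalent would be a reshape to `(batch_size,)` after the same size check; either way, the docstrings could then document a single canonical shape.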
