
@sunjingan

**Pull Request**

feat(examples): Add CorrDiffSolar for high-resolution solar downscaling

Pull Request Body (Description)

Hi PhysicsNeMo Team,

This Pull Request introduces a new, comprehensive example for high-resolution solar irradiance downscaling using a conditional diffusion model, named CorrDiffSolar.

This end-to-end example demonstrates a real-world climate science application, showcasing how to prepare complex datasets, train a two-stage generative model, and perform large-scale inference using techniques like MultiDiffusion. It serves as a valuable use case for researchers interested in AI-based downscaling.

Key Contributions

  • Two-Stage Generative Model: Implements a Regression + Diffusion pipeline to downscale low-resolution (0.25°) ERA5 data to high-resolution (0.05°) solar radiation fields, a 5x refinement in each horizontal direction.
  • Comprehensive Data Preparation Scripts: Provides a full suite of scripts under prepare_solar_data/ to process raw ERA5 and Himawari-8 satellite data into a model-ready format, including DEM file generation.
  • Custom SolarDataset: Includes a specialized PyTorch Dataset (SolarDataset) for efficiently handling the paired low-res/high-res spatiotemporal data required for training.
  • Large-Scale Inference with MultiDiffusion: The inference logic utilizes a sliding-window approach (MultiDiffusion) to generate predictions for large domains that do not fit in GPU memory, making the model scalable.
  • Hydra-based Configuration: All training and inference workflows are managed via clear and modular Hydra configuration files.
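
To illustrate the sliding-window idea mentioned under "Large-Scale Inference with MultiDiffusion", here is a minimal sketch of how window coordinates covering a large grid could be generated. The function name `sliding_windows`, its parameters, and the (top, left, bottom, right) box layout are assumptions made for illustration; they are not taken from the PR's SolarDataset or MultiDiffusion code.

```python
def sliding_windows(height, width, patch, stride):
    """Return (top, left, bottom, right) boxes that cover a height x width grid.

    Windows are `patch` pixels on a side and advance by `stride`. The last
    window in each direction is shifted back so that it ends at the grid edge.
    """
    if patch > height or patch > width:
        raise ValueError("patch must fit inside the grid")
    if stride <= 0:
        raise ValueError("stride must be positive")

    def starts(size):
        # Regular starts, plus one extra start flush with the far edge if needed.
        pos = list(range(0, size - patch + 1, stride))
        if pos[-1] != size - patch:
            pos.append(size - patch)
        return pos

    return [
        (top, left, top + patch, left + patch)
        for top in starts(height)
        for left in starts(width)
    ]


# Hypothetical toy grid: 10 x 12 pixels, 4 x 4 patches, stride 3.
windows = sliding_windows(height=10, width=12, patch=4, stride=3)
```

A stride smaller than the patch size makes neighbouring windows overlap, which is what lets the per-window predictions be blended afterwards.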

Proposed Code Structure

All new code is self-contained within a new directory in the examples/ folder to ensure no disruption to the core library:

examples/solar_downscaling/
├── conf/
│   ├── config_training_multidiffsolar_regression.yaml
│   ├── config_training_multidiffsolar_diffusion.yaml
│   └── ... (inference configs)
├── helpers/
│   └── generate_helpers.py
├── datasets/
│   └── solar_dataset.py
├── prepare_solar_data/
│   ├── get_dem/
│   ├── prepare_era5.py
│   ├── prepare_h08_hourly.py
│   └── get_stats.py
├── train.py
├── generate.py
└── README_SOLAR.md  <-- Detailed documentation

Getting Started: A Quick Workflow Overview

A complete guide is available in examples/solar_downscaling/README_SOLAR.md. The high-level steps are:

  1. Environment Setup:

    • It is recommended to use the official PhysicsNeMo Docker container for a seamless setup.
  2. Data Preparation:

    • Download raw ERA5 and Himawari-8 data.
    • Run the scripts in prepare_solar_data/ to process the data, generate the DEM, and compute statistics. This will create the HRdata/, LRdata/, dem.nc, and stats.json files required for training.
  3. Model Training (Two Stages):

    • Stage 1 (Regression):
      torchrun --standalone --nnodes=1 --nproc_per_node=8 train.py --config-name=config_training_multidiffsolar_regression.yaml
    • Stage 2 (Diffusion):
      torchrun --standalone --nnodes=1 --nproc_per_node=8 train.py --config-name=config_training_multidiffsolar_diffusion.yaml
  4. Model Inference:

    • Run inference using either the regression model alone or the full two-stage pipeline with a command like:
      python generate.py --config-name=config_generate_multidiffsolar_wDiff.yaml
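
The statistics step in Data Preparation can be pictured with a small sketch like the one below, which computes a per-variable mean and standard deviation, serializes them to JSON, and uses them to standardize the data. The schema shown here (a "mean" and "std" entry per variable) and the helper names are assumptions; the actual layout of stats.json produced by get_stats.py is not described in this PR text.

```python
import json
import math


def channel_stats(values):
    """Return the mean and population standard deviation of a sequence."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": mean, "std": math.sqrt(var)}


def normalize(values, stats):
    """Standardize values to zero mean and unit variance."""
    return [(v - stats["mean"]) / stats["std"] for v in values]


# Toy irradiance samples in W/m^2; the variable key "ssrd" is illustrative.
samples = {"ssrd": [0.0, 200.0, 400.0, 600.0, 800.0]}
stats = {name: channel_stats(vals) for name, vals in samples.items()}
stats_json = json.dumps(stats, indent=2)  # hypothetical stats.json content
normalized = normalize(samples["ssrd"], stats["ssrd"])
```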

We believe this example will be a great addition to the PhysicsNeMo repository. Please let us know if you have any questions or feedback.

@greptile-apps

greptile-apps bot commented Nov 12, 2025

Greptile Overview

Greptile Summary

This PR adds a comprehensive solar irradiance downscaling example to PhysicsNeMo, implementing a two-stage generative model (Regression + Diffusion) with MultiDiffusion for large-scale inference.

Key Changes:

  • Added SolarDataset class for paired ERA5-Himawari8 data with sliding window sampling
  • Implemented MultiDiffusion class for memory-efficient sliding-window inference
  • Added generate_solar function to orchestrate regression and diffusion steps
  • Modified train.py and generate.py to handle solar-specific data format (includes window coordinates)
  • Provided complete data preparation scripts for ERA5 and Himawari-8 data
  • Added comprehensive documentation and example configurations

Critical Issue:

  • Syntax error in generate.py:414-415: a missing comma between function arguments prevents the file from being parsed, so the script fails before any code runs

Architecture:
The implementation follows a clean two-stage approach where the regression model provides a baseline prediction, and the diffusion model adds high-frequency details through residual learning. The MultiDiffusion technique enables generation of arbitrarily large high-resolution outputs by processing overlapping patches and averaging the results.
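
The patch averaging and residual addition described above can be sketched as follows. This is a simplified illustration, assuming NumPy arrays and a hypothetical helper `stitch_patches`; it averages finished patches once, whereas the PR's MultiDiffusion class is described as averaging overlapping regions at each diffusion step.

```python
import numpy as np


def stitch_patches(patches, windows, shape):
    """Average overlapping patches into one field of the given (H, W) shape."""
    value = np.zeros(shape, dtype=np.float64)  # running sum of patch values
    count = np.zeros(shape, dtype=np.float64)  # patches covering each pixel
    for patch, (top, left, bottom, right) in zip(patches, windows):
        value[top:bottom, left:right] += patch
        count[top:bottom, left:right] += 1
    if np.any(count == 0):
        raise ValueError("windows do not cover the whole domain")
    return value / count


# Two 4x4 windows that overlap by two columns on a 4x6 domain (toy values).
windows = [(0, 0, 4, 4), (0, 2, 4, 6)]
regression_patches = [np.full((4, 4), 1.0), np.full((4, 4), 3.0)]
residual_patches = [np.full((4, 4), 0.1), np.full((4, 4), 0.3)]

regression = stitch_patches(regression_patches, windows, (4, 6))
residual = stitch_patches(residual_patches, windows, (4, 6))
final = regression + residual  # regression baseline plus diffusion residual
```

In the overlap (columns 2 and 3) each pixel receives the mean of the two patches, which smooths the seam between neighbouring windows.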

Confidence Score: 2/5

  • This PR cannot be merged safely due to a critical syntax error that will prevent code execution
  • The syntax error in generate.py (missing comma at line 414) is a blocking issue that will cause immediate failure. Once fixed, the implementation appears well-structured with proper MultiDiffusion logic, comprehensive documentation, and appropriate integration with the existing codebase. The changes are contained within the examples directory and follow established patterns.
  • examples/weather/corrdiff/generate.py - Must fix syntax error before merging

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| examples/weather/corrdiff/generate.py | 1/5 | Added solar downscaling logic with generate_solar function call and MultiDiffusion support. Contains critical syntax error (missing comma) that will prevent execution. |
| examples/weather/corrdiff/train.py | 4/5 | Added solar flag detection and modified batch unpacking to handle windows parameter. Changes are minimal and correctly integrated. |
| examples/weather/corrdiff/datasets/solar_dataset.py | 3/5 | New SolarDataset class for ERA5-Himawari8 paired data. Implements sliding window sampling, solar zenith angle computation, and data normalization. Complex logic but generally well-structured. |
| examples/weather/corrdiff/helpers/generate_helpers.py | 4/5 | Added MultiDiffusion class for sliding-window inference and generate_solar function. Implementation follows standard diffusion sampling patterns with proper tensor stitching. |
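
For readers unfamiliar with the solar zenith angle mentioned for solar_dataset.py, the sketch below shows one textbook way to approximate it from latitude, day of year, and hour angle. It uses a common first-order declination approximation and is only an assumption about what such a computation might look like; it is not the formula used in SolarDataset, which this summary does not show.

```python
import math


def solar_zenith_deg(lat_deg, day_of_year, hour_angle_deg):
    """Approximate solar zenith angle in degrees.

    hour_angle_deg is 0 at local solar noon and changes by 15 degrees per hour.
    """
    # First-order approximation of the solar declination (radians).
    decl = math.radians(-23.44) * math.cos(
        2.0 * math.pi * (day_of_year + 10) / 365.0
    )
    lat = math.radians(lat_deg)
    hour = math.radians(hour_angle_deg)
    # cos(zenith) = sin(lat) sin(decl) + cos(lat) cos(decl) cos(hour angle)
    cos_zen = (math.sin(lat) * math.sin(decl)
               + math.cos(lat) * math.cos(decl) * math.cos(hour))
    cos_zen = max(-1.0, min(1.0, cos_zen))  # guard against rounding error
    return math.degrees(math.acos(cos_zen))


# Near the March equinox at the equator, the noon sun is almost overhead.
noon_equator = solar_zenith_deg(lat_deg=0.0, day_of_year=81, hour_angle_deg=0.0)
```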

Sequence Diagram

sequenceDiagram
    participant User
    participant TrainScript as train.py
    participant GenScript as generate.py
    participant Dataset as SolarDataset
    participant RegNet as Regression Network
    participant DiffNet as Diffusion Network
    participant MultiDiff as MultiDiffusion

    Note over User,MultiDiff: Training Phase (Stage 1: Regression)
    User->>TrainScript: torchrun train.py --config regression
    TrainScript->>Dataset: Initialize SolarDataset
    Dataset-->>TrainScript: Return dataset with windows
    loop Training Loop
        TrainScript->>Dataset: Get batch (img_clean, img_lr, windows)
        Dataset-->>TrainScript: Return training data
        TrainScript->>RegNet: Forward pass
        RegNet-->>TrainScript: Predictions
        TrainScript->>TrainScript: Compute loss & update weights
    end
    TrainScript-->>User: Save regression checkpoint

    Note over User,MultiDiff: Training Phase (Stage 2: Diffusion)
    User->>TrainScript: torchrun train.py --config diffusion
    TrainScript->>RegNet: Load regression checkpoint
    TrainScript->>Dataset: Initialize SolarDataset
    loop Training Loop
        TrainScript->>Dataset: Get batch
        Dataset-->>TrainScript: Return data
        TrainScript->>RegNet: Get regression baseline
        RegNet-->>TrainScript: Base prediction
        TrainScript->>DiffNet: Forward pass (learn residual)
        DiffNet-->>TrainScript: Residual prediction
        TrainScript->>TrainScript: Compute residual loss
    end
    TrainScript-->>User: Save diffusion checkpoint

    Note over User,MultiDiff: Inference Phase
    User->>GenScript: python generate.py --config inference
    GenScript->>Dataset: Initialize SolarDataset (generating=True)
    Dataset-->>GenScript: Return dataset with sliding windows
    GenScript->>RegNet: Load regression checkpoint
    GenScript->>DiffNet: Load diffusion checkpoint
    loop For each time step
        GenScript->>Dataset: Get batch (img_tar, img_lr, windows)
        Dataset-->>GenScript: Return full-size data with windows
        GenScript->>MultiDiff: generate_solar(img_lr, windows, nets)
        loop For each window (Regression)
            MultiDiff->>RegNet: regression_step(img_lr_patch)
            RegNet-->>MultiDiff: Patch prediction
            MultiDiff->>MultiDiff: Stitch patches together
        end
        MultiDiff->>MultiDiff: Average overlapping regions
        alt Diffusion enabled
            loop For each ensemble seed
                loop For each diffusion step
                    loop For each window
                        MultiDiff->>DiffNet: Denoise patch
                        DiffNet-->>MultiDiff: Denoised patch
                        MultiDiff->>MultiDiff: Accumulate in value/count
                    end
                    MultiDiff->>MultiDiff: Average overlapping regions
                end
                MultiDiff->>MultiDiff: residual = diffusion_output
                MultiDiff->>MultiDiff: final = regression + residual
            end
        end
        MultiDiff-->>GenScript: Final high-res output
        GenScript->>GenScript: Save to NetCDF
    end
    GenScript-->>User: Output: corrdiff_output.nc

@greptile-apps greptile-apps bot left a comment

13 files reviewed, 1 comment

Comment on lines +414 to +415
image_tar_full=image_tar,
windows=windows,

syntax: a comma is missing after the argument logger0 = logger0, which causes a syntax error

Suggested change
image_tar_full=image_tar,
windows=windows,
logger0 = logger0,
img_out_channels = img_out_channels,
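
As a rough illustration of why this kind of omission is blocking, the sketch below parses two versions of a call with the standard-library `ast` module. The call to `generate(...)` is a hypothetical stand-in with made-up surroundings, not the actual code from generate.py; only the argument names are borrowed from the suggestion above.

```python
import ast


def parses(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# Hypothetical call with one keyword argument per line.
missing_comma = (
    "result = generate(\n"
    "    windows=windows,\n"
    "    logger0=logger0\n"  # no trailing comma here
    "    img_out_channels=img_out_channels,\n"
    ")\n"
)
fixed = missing_comma.replace("logger0=logger0\n", "logger0=logger0,\n")

broken_ok = parses(missing_comma)
fixed_ok = parses(fixed)
```

Because Python compiles a whole file before executing it, an error like this stops the script at startup rather than only when the affected call is reached.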
