U/alxmrs/experiments/kr2#681
Conversation
a736b1a to
c159eb8
Compare
| @@ -28,26 +29,34 @@ def patch_from( | |||
|
|
|||
|
|
|||
| class PerceiverEncoder(nn.Module): | |||
There was a problem hiding this comment.
The changes in this branch are in service of this encoder update.
|
@codex may I have your review? |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 2a89013570
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| prognostic = self.maybe_add_3d_coordinates( | ||
| prognostic, ctx.output_resolution_cpu | ||
| ) | ||
|
|
||
| fts = self.encoder(prognostic, boundary, ctx.output_resolution_cpu) |
There was a problem hiding this comment.
Use input-resolution coordinates for FOMO prognostic path
When src and dst resolutions differ in mix schedule training, TrainData carries prognostic tensors on ctx.input_resolution_cpu (see TorchTrainDataset.ctx), but this code feeds ctx.output_resolution_cpu into the prognostic path. That makes spatial metadata inconsistent with the tensor shape: Concat3dCoordinates can fail with a size mismatch if 3D coordinates are enabled, and even without 3D coordinates the encoder’s positional encoding (pos_scale_enc) is computed from output-grid lat/lon with input-grid patch sizes, producing a token count mismatch at the x + pos_encoding add. As written, cross-resolution mix runs can crash on the first forward pass.
Useful? React with 👍 / 👎.
|
@codex may I have your review? |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d1c7d2e68d
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
|
@codex may I have another review? |
|
Codex Review: Didn't find any major issues. 🚀 ℹ️ About Codex in GitHubCodex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback". |
944c435 to
022914a
Compare
| # so the encoder uses the correct resolution. | ||
| ctx = dataclasses.replace( | ||
| ctx, input_resolution_cpu=ctx.output_resolution_cpu | ||
| ) |
There was a problem hiding this comment.
This addresses the TODO on the left.
| class Stepper: | ||
| def __init__(self): | ||
| pass | ||
| def train_batch( |
There was a problem hiding this comment.
The Stepper class was really three functions in a trenchcoat. So, I extracted them.
…oading to split the input.
…boundary tensors after projecting the boundary.
… Encoder gets cross attention hparams Refactor: extracted function to created patches.
… catch a bug. Added fourier pos encoding to the boundary layer (new module in augment_input.py).
…d test to confirm the need for the fix.
58f278e to
8f7ffa2
Compare
This PR makes four contributions towards implementing O1KR2 (#615):
GridContextinput resolution after step 1. In this new scheme, step 0 does either downscaling or upscaling, and step 1+ updates the grid context to make the input resolution the output resolution. This is needed during "mix" schedule multiscale training, since step 1 and after the prognostic is fed back to the previous decoder step at the output resolution. This does not affect "match" schedule multiscale training, because both resolutions are already equal.test_fomo_cross_resolutionset of integration tests. These vet that single step cross resolution (prog/boundary), single step mix schedule, and two step AR mixed schedule training of the FOMO model all work as expected.