feat: add optional boundary datastore support #635
Add support for loading boundary forcing from a separate datastore, enabling LAM models to ingest boundary conditions from a different domain (e.g. ERA5 boundaries for a COSMO/DANRA interior).
- NeuralLAMConfig accepts optional `datastore_boundary` field
- load_config_and_datastore returns 3-tuple (config, datastore, datastore_boundary)
- WeatherDataset loads, windows, and standardizes boundary forcing
- __getitem__ returns 5-tuple (init_states, target_states, forcing, boundary, target_times)
- New CLI args --num_past_boundary_steps / --num_future_boundary_steps
- ForecasterModule.common_step unpacks boundary (not yet wired to forward)
- 4 new boundary-specific tests, all 157 tests pass
refs mllam#108
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add MDP-based ERA5 boundary example at tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ with config.yaml, era5.datastore.yaml (WeatherBench2 64x32 equiangular), and danra.datastore.yaml (DANRA 100m winds interior). Add DATASTORES_BOUNDARY_EXAMPLES dict and init_datastore_boundary_example() to conftest.py for use in boundary integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MDPDatastore.__init__ crashed with KeyError when loading a datastore that has only forcing+static (no state), e.g. ERA5 boundary data. Fix is_ensemble check to guard against missing state, and grid_shape_state to fall back to forcing/static categories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Make _get_analysis_times fall back to forcing file patterns when no
state files exist, guard get_dataarray("state") against empty var_names,
and prevent empty feature list from matching state loading path.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guard against missing state/static feature keys in the zarr, not just forcing. Boundary-only datastores (e.g. ERA5) may lack state_feature entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Return len(grid_index) directly instead of computing from grid_shape_state, which is more robust for boundary-only datastores. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…eat/boundary-datastore
- Shrink the ERA5 boundary test dataset to 2022-03-30..2022-04-12 (was 1990-2022), add per-input lat/lon coord_ranges and enable mllam's convex-hull domain_cropping with include_interior_points=true. Stats computation drops from minutes to seconds and the cached zarr stays under 1 MB.
- Stack [longitude, latitude] directly into grid_index in the era5 dim_mapping (per mllam's example.era5_cropped.yaml) so the original coord names survive for the convex-hull crop -- removes the need for any rename-preserve workaround in neural-lam.
- Generalise the MDPDatastore units loop over self.spatial_coordinates with sensible defaults for x/y (m) and longitude/latitude/lon/lat (degrees_*) so ERA5-style geographic datastores work.
- Register a pytest `slow` marker and a `--run-slow` CLI flag so the ERA5 boundary integration test (added in this PR) is skipped by default and can be run on demand.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Drop state metadata after init and raise KeyError on `state` lookups so plotting/model code that accidentally queries state on a boundary fails loudly. Real ERA5-style boundary datastores expose only forcing fields, and the existing boundary tests (test_datasets.py) only ever access forcing on the boundary, so making the dummy state-less brings it closer to real boundary semantics without changing test behaviour.
Asserts plot_prediction works against a boundary datastore with no state
category (forcing-only), exercising the get_xy("forcing") / get_lat_lon(
"forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore
state-less change in mllam#635.
Resolve conflicts from mllam#239 (normalize on GPU):
- weather_dataset.py: drop the CPU standardization path (state/forcing/boundary stats setup, _compute_std_safe, in-__getitem__ scaling, and the standardize= plumbing in WeatherDataModule); keep the boundary-datastore feature and the 5-tuple sample (init, target, forcing, boundary, times).
- models/module.py: on_after_batch_transfer now unpacks/returns the 5-tuple, standardizing state+forcing on-device and passing boundary through unchanged (boundary is not yet consumed by the forecaster on this branch).
- tests/test_datasets.py: drop the dataset-standardization tests mllam#239 removed (incl. boundary standardization, now a GPU concern); keep the structural boundary tests without the removed standardize= kwarg.
- tests/test_gpu_normalization.py: feed/expect the 5-tuple batch.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
joeloskarsson
left a comment
It's so great that you started the work on this 😄 I had a look through everything except the tests now. Had one major comment about the interior-boundary data alignment in the dataset, otherwise mostly small things.
else:
    raise ValueError("Dataset has no state, forcing, or static data")
Is this really possible now, then we have a fully empty datastore? I feel like we should raise an error way before this then (in the constructor).
Fair point, fully empty was never meant to be valid. Added a check in __init__ that requires state or forcing, and simplified grid_shape_state to just pick between those two (dropped the static fallback and the defensive else). fixed in cf323a8.
normalized here so the work runs on the accelerator. The boundary
forcing is passed through unchanged.
Is there a good reason to not do the standardization of boundary data here? This feels confusing and inconsistent to me. Best to do all standardization for the batch in the same place.
As we discussed over lunch, I would argue that this is now part of the model side PR B, since we are using on_batch_transfer_end methods now and relieved the WeatherDataset of its standardization duties. But, we said to implement it here nonetheless so: ForecasterModule.__init__ now takes an optional datastore_boundary and registers boundary_mean/boundary_std buffers, and on_after_batch_transfer standardizes the boundary exactly like the interior forcing. With no boundary datastore the buffers stay None and the tensor passes through unchanged. Covered by the two new tests in tests/test_gpu_normalization.py. done in 9a9af31.
(
    init_states,
    target_states,
    forcing_features,
    boundary_features,
    batch_times,
) = batch
Suggested change:
(
    init_states,
    target_states,
    forcing_features,
    boundary_features,
    batch_times,
) = batch
# NOTE: For now we do not use the boundary features from here. This is yet
# to be implemented on the model side.
# init_states: (2, N_grid, d_features)
# target_states: (ar_steps, N_grid, d_features)
# forcing: (ar_steps, N_grid, d_windowed_forcing)
# boundary: (ar_steps, N_boundary_grid, d_windowed_boundary)
I am wondering if it is wise to return an empty tensor here, or if it would just be better for this to be None? I suppose that with None we would need a custom (but simple) collate function for the batching.
My reasoning is that if you for example forget to specify a boundary datastore then many things will still work and this will be a silent bug, potentially even the forward pass would work (in some scenario)? I can not see a case where we would not want to explicitly do things differently in a model depending on if the boundary forcing is present or not, and it being None would be a good signifier of this. Otherwise everything processing boundary would need to have the boundary datastore, to check if that is None.
I am not sure about this, and could probably be convinced either way. But would be happy to hear your thoughts.
I went with an empty tensor to avoid a custom collate and keep __getitem__ shape-stable for code that just unpacks the 5-tuple. The silent-bug risk is real though, and I think the model wiring (PR B) is the right place to catch it: the model knows whether it expects boundary, so we can assert there that boundary.shape[-1] > 0 matches datastore_boundary is not None. Keeps the loader simple but still fails loudly when they get out of sync.
what do you think?
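To make the proposed model-side check concrete, here is a minimal sketch of it. The helper name `check_boundary_consistency` and its arguments are hypothetical (nothing like it exists in this PR), and it works on a plain shape tuple so it runs without torch.

```python
def check_boundary_consistency(boundary_shape, has_boundary_datastore):
    """Raise if the boundary tensor and the datastore config disagree.

    boundary_shape: shape tuple of the per-sample boundary tensor, e.g.
        (ar_steps, N_boundary_grid, d_windowed_boundary).
    has_boundary_datastore: True when a boundary datastore was configured.
    """
    # An empty last dimension is how the loader signals "no boundary".
    has_boundary_features = boundary_shape[-1] > 0
    if has_boundary_features != has_boundary_datastore:
        raise ValueError(
            "Boundary tensor and boundary datastore are out of sync: "
            f"shape={boundary_shape}, "
            f"datastore configured={has_boundary_datastore}"
        )


# Passes silently: no datastore and an empty boundary tensor.
check_boundary_consistency((3, 100, 0), has_boundary_datastore=False)
```

With this in place, forgetting to configure the boundary datastore would fail at the first batch instead of training on an empty tensor.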
if self.da_boundary_forcing is not None:
    da_boundary_windowed = self._slice_forcing_time(
        da_forcing=self.da_boundary_forcing,
        idx=sample_idx,
I don't think this is the correct idx for the boundary. What about when the boundary has a different time step? What about when the boundary is a forecast? I feel like we used to have a lot of logic to find this alignment, that I can't see now, which makes me fear that we dropped something important.
But this might also be a matter of the scope of this PR, if you were intending this to restrict to reanalysis boundary with the same timestep? I would though prefer to make what we merge in as similar as possible to our contribution in the paper :)
You were right, I messed up the partial port from the research branch, trying to reduce the scope of the PR.
Ported the time-based alignment from the research branch, so all four combinations work now: analysis/forecast interior crossed with analysis/forecast boundary. See 140caf5.
A few intentional deviations from the research branch:
- No `interior_subsample_step` / `boundary_subsample_step` (buggy, orthogonal)
- No `window_time_deltas`, `dynamic_time_deltas`, `time_slice` concat in the boundary tensor (didn't help, can be added later)
- Per-step the window still centers on the target time (the boundary condition for the interior at the predicted time), but only after the launch is fixed correctly on init. See a543ce9.
Could we also port over a version of https://github.com/joeloskarsson/neural-lam-dev/blob/research/tests/test_time_slicing.py? I remember this being very useful to figure out all the alignment between interior and boundary data.
Good idea, that file is really useful. Extended the existing tests/test_time_slicing.py rather than adding a new one. SinglePointDummyDatastore now supports forecast mode, plus a new BoundaryOnlyDummyDatastore that mirrors a real boundary store (forcing-only, state access raises KeyError).
Added in cbc1c14.
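For readers who have not opened the test file, this is a rough sketch of the "forcing-only, state access raises KeyError" idea. The class below is a hypothetical stand-in, not the actual `BoundaryOnlyDummyDatastore`, and its single method only borrows the shape of a variable-name lookup for illustration.

```python
class ForcingOnlyStore:
    """Hypothetical stand-in for a boundary-only (forcing-only) datastore."""

    def __init__(self, forcing_vars):
        # Only the forcing category is populated; there is no state entry.
        self._vars = {"forcing": list(forcing_vars)}

    def get_vars_names(self, category):
        # Fail loudly if anything asks the boundary for state variables.
        if category == "state":
            raise KeyError("boundary-only datastore has no 'state' category")
        return self._vars.get(category, [])


store = ForcingOnlyStore(["u", "v"])
print(store.get_vars_names("forcing"))  # forcing lookups work as usual
```

Raising instead of returning an empty list means any code path that accidentally queries state on the boundary shows up as a test failure rather than as silently empty data.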
…boundary-datastore
…pe_state Previously a datastore with no state, forcing or static would silently reach `grid_shape_state` and fail with a confusing fallback error. Now the constructor raises immediately if neither state nor forcing is present, and the fallback in `grid_shape_state` collapses to a simple `state if present else forcing` pick. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
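The constructor check and the simplified fallback described in this commit can be sketched roughly as follows. The class is a hypothetical stand-in that takes a plain dict of grid shapes; it is not the real `MDPDatastore`, and the error type and message are assumptions.

```python
class StateOrForcingDatastore:
    """Hypothetical stand-in for a datastore; not the real MDPDatastore."""

    def __init__(self, grid_shapes):
        # grid_shapes maps a category name to its (nx, ny) grid shape.
        self._grid_shapes = dict(grid_shapes)
        has_state = "state" in self._grid_shapes
        has_forcing = "forcing" in self._grid_shapes
        # Reject an unusable datastore up front, in the constructor.
        if not (has_state or has_forcing):
            raise ValueError("Datastore must contain state or forcing data")

    @property
    def grid_shape_state(self):
        # Prefer state; fall back to forcing for boundary-only datastores.
        if "state" in self._grid_shapes:
            return self._grid_shapes["state"]
        return self._grid_shapes["forcing"]


boundary_only = StateOrForcingDatastore({"forcing": (64, 32)})
print(boundary_only.grid_shape_state)
```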
Boundary features are unpacked from the batch but the forecaster forward pass does not consume them yet; the model-side wiring lands in a follow-up PR (mllam#108). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…is/forecast modes
Replace integer-idx boundary slicing with time-based nearest-neighbor (pad) lookup so the boundary datastore can have a different step length than the interior, and either side may be in analysis or forecast mode.
- Add `get_time_step`, `check_time_overlap`, `crop_time_if_needed` helpers in `neural_lam.utils` (ported from the research branch in joeloskarsson/neural-lam-dev, with an extra guard against silent argmax-on-all-false cropping).
- Refactor `WeatherDataset`: precompute the within-sample state step and any forecast lead-time step in __init__; run `crop_time_if_needed` + `check_time_overlap` against the boundary so the first/last samples never fall outside boundary coverage; replace `_slice_forcing_time` with `_window_forcing_in_time` for time-aligned windowing of cross-datastore boundary; preserve the original integer-idx fast path as `_window_same_forecast_by_idx` for same-datastore forecast forcing (npyfilesmeps has non-unique analysis_time so the pandas pad-lookup cannot be used there).
- Window alignment matches the existing forcing convention (target time, i.e. `state_times[init_steps + i]`).
- Split `test_boundary_dataset_length_unchanged` into a no-crop and a cropping case to document the new behaviour.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
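As an illustration of the pad-style lookup this commit describes, here is a stdlib-only sketch for the analysis-boundary case. The function names `pad_index` and `boundary_window` are hypothetical; the real code works on xarray/pandas time coordinates rather than Python lists, so treat this as a reading of the idea, not the implementation.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def pad_index(times, target):
    """Index of the latest entry in sorted `times` that is <= target."""
    idx = bisect_right(times, target) - 1
    if idx < 0:
        raise ValueError("target precedes the first boundary time")
    return idx


def boundary_window(boundary_times, target, num_past, num_future):
    """Boundary times in a window centred on the pad-matched target time."""
    centre = pad_index(boundary_times, target)
    start, stop = centre - num_past, centre + num_future + 1
    if start < 0 or stop > len(boundary_times):
        raise ValueError("boundary window falls outside boundary coverage")
    return boundary_times[start:stop]


# 6-hourly boundary, interior target at 13:00 (1-hourly interior).
t0 = datetime(2022, 3, 30)
boundary_times = [t0 + timedelta(hours=6 * i) for i in range(8)]
window = boundary_window(
    boundary_times, t0 + timedelta(hours=13), num_past=1, num_future=1
)
print([t.strftime("%H:%M") for t in window])  # ['06:00', '12:00', '18:00']
```

Matching by time rather than by sample index is what lets the two datastores have different step lengths: the 13:00 target resolves to the 12:00 boundary step regardless of where either sits in its own array.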
Port a slimmed-down version of the alignment tests from joeloskarsson/neural-lam-dev to exercise the new time-based boundary windowing in WeatherDataset.
- Extend `SinglePointDummyDatastore` with forecast-mode support.
- Add a boundary-only variant whose state lookup raises KeyError, to catch any path that accidentally asks the boundary for state.
- `test_time_slicing_boundary_analysis`: parametrised over past/future window sizes, asserts exact boundary window values around each target state time.
- `test_boundary_step_length_mismatch_supported`: 1h interior with a 6h boundary, verifies the pad-matched lookup.
- `test_forecast_interior_with_analysis_boundary` and `test_analysis_interior_with_forecast_boundary`: the two mixed analysis/forecast combinations.
- `test_check_time_overlap_insufficient_raises`: surface the cropping failure path with a clear error.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Boundary forcing was previously passed through unchanged in
`on_after_batch_transfer`, leaving the only normalization step on a
separate code path and inconsistent with how interior state/forcing are
handled. Wire it through the same on-device hook.
- `ForecasterModule.__init__` takes a new optional `datastore_boundary`
arg (excluded from `save_hyperparameters` alongside `datastore` and
`forecaster`, so it must be passed at `load_from_checkpoint` time).
- Register `boundary_mean` / `boundary_std` from
`datastore_boundary.get_standardization_dataarray("forcing")` when a
boundary datastore is provided; otherwise leave both as None.
- `on_after_batch_transfer` standardizes the boundary tensor the same
way it standardizes forcing: feature-major `(feature, window)`
per-feature stats are tiled once on the first batch and cached.
- Update the NOTE in `common_step` to reflect that boundary is now
standardized but still not consumed by the forecaster (mllam#108).
- Pass `datastore_boundary` from `train_model.main` to the module.
- New tests: `test_boundary_standardized_when_datastore_provided` and
`test_boundary_passthrough_when_no_boundary_datastore`.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
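The tiling of per-feature statistics over the window can be sketched without torch as below. This assumes the windowed feature dimension is flattened feature-major, i.e. all window positions of feature 0 first, then feature 1, and so on; that ordering is my reading of the `(feature, window)` note above, and the helper names are hypothetical.

```python
def tile_feature_stats(per_feature, window):
    """Repeat each per-feature statistic `window` times, feature-major."""
    return [value for value in per_feature for _ in range(window)]


def standardize_row(row, mean, std, window):
    """Standardize one flattened (feature, window) row of boundary forcing."""
    tiled_mean = tile_feature_stats(mean, window)
    tiled_std = tile_feature_stats(std, window)
    if len(row) != len(tiled_mean):
        raise ValueError("row length must equal n_features * window")
    return [(x - m) / s for x, m, s in zip(row, tiled_mean, tiled_std)]


# Two features, window of three steps: [f0w0, f0w1, f0w2, f1w0, f1w1, f1w2]
row = [10.0, 12.0, 14.0, 1.0, 2.0, 3.0]
print(standardize_row(row, mean=[10.0, 0.0], std=[2.0, 1.0], window=3))
# [0.0, 1.0, 2.0, 1.0, 2.0, 3.0]
```

In the module the tiled statistics would be computed once and cached, as the commit describes; the sketch recomputes them on every call only to stay short.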
The forecast-boundary path selected its analysis_time by pad-matching the first target time, which can pick a boundary forecast launched after model init - unavailable operationally. Anchor on the model init time (state_times[init_steps - 1], strictly before) instead, matching the research branch, and assert the per-step boundary valid time never runs ahead of the target. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
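A small sketch of why the anchor matters, using stdlib datetimes. The helper names are hypothetical, and the sketch selects the latest launch at or before the init time; whether the real code also excludes a launch exactly at init time is not visible from this commit message.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def select_boundary_launch(launch_times, anchor_time):
    """Latest boundary forecast launched at or before `anchor_time`."""
    idx = bisect_right(launch_times, anchor_time) - 1
    if idx < 0:
        raise ValueError("no boundary forecast available before the anchor")
    return launch_times[idx]


def check_valid_times(valid_times, target_times):
    """Ensure no boundary valid time runs ahead of its target time."""
    for valid, target in zip(valid_times, target_times):
        if valid > target:
            raise ValueError(f"boundary valid time {valid} is after {target}")


t0 = datetime(2022, 3, 30)
launches = [t0 + timedelta(hours=6 * i) for i in range(3)]  # 00, 06, 12
init_time = t0 + timedelta(hours=5)      # last initial state
first_target = t0 + timedelta(hours=6)   # first predicted state

# Anchoring on the first target picks the 06:00 launch, which does not
# exist yet when the model starts at 05:00.
print(select_boundary_launch(launches, first_target).hour)  # 6
# Anchoring on the init time picks the 00:00 launch instead.
print(select_boundary_launch(launches, init_time).hour)     # 0
```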
Hi @sadamov,

@observingClouds yes please! Feel free to leave your reviews. If there are major pieces missing, you should have push access to my PR-branch here and after a quick ping you can also push directly.
Resolved conflicts:
- neural_lam/datastore/mdp.py: kept the state-or-forcing-required validation while taking main's local _ds variable pattern (refactored upstream).
- neural_lam/train_model.py: moved --num_past_boundary_steps and --num_future_boundary_steps to use data_group.add_argument, matching the argument-groups refactor from mllam#641.
- neural_lam/weather_dataset.py: kept mllam#635's renamed _window_same_forecast_by_idx and shared_kwargs setup pattern; added type hints to match the post-mllam#631 type-hint sweep. Updated __getitem__ and __iter__ return types from 4-tuple to 5-tuple to reflect the new boundary tensor. Typed shared_kwargs as dict[str, Any] to satisfy mypy on the **kwargs splat.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolved conflicts:
- neural_lam/train_model.py: moved --num_past_boundary_steps and --num_future_boundary_steps to use data_group.add_argument, matching the argument-groups refactor from mllam#641 (same as mllam#635 resolution).
- neural_lam/weather_dataset.py: kept mllam#636's _slice_forcing_time signature with the extra num_past_steps/num_future_steps params, added type hints from main. Typed shared_kwargs as dict[str, Any] to satisfy mypy on the **kwargs splat. Updated _build_item_dataarrays, __getitem__, and __iter__ return types from 4-tuple to 5-tuple to reflect the new boundary tensor.
- neural_lam/vis.py: kept mllam#636's boundary_da / boundary_datastore / boundary_margin_degrees args on plot_on_axis, plot_prediction, and plot_spatial_error. crop_to_interior is preserved as a deprecated parameter on plot_prediction and plot_spatial_error (already handled with a deprecation warning earlier on this branch). Type hints from main's post-mllam#631 sweep are applied throughout.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Matches the CI invocation introduced in mllam#651 so this branch's eventual ERA5 boundary integration test (and any other @pytest.mark.slow on this branch) is exercised by CI, not just by local --run-slow invocations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sadamov I added #652 to say how I would generalise the config. I am not saying that we need to use that config layout with this PR, but you could maybe have a look at that PR and see what you think? If you do like to use it then we could maybe consider using a structure in this PR that could remain relatively unchanged down the line.
Per the simplification in mllam#651, the custom --run-slow flag is being dropped in favour of relying on pytest's native -m marker selection. Remove the corresponding pytest_addoption + pytest_collection_modifyitems from conftest and the --run-slow flag from CI on this branch too, so this PR doesn't re-introduce conflicts when mllam#651 lands. The slow marker registration in pyproject.toml stays, ready for use on any future @pytest.mark.slow test on this branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… constructor only)
Pre-emptively land the public CONFIG-LAYER shape proposed in mllam#652 on top of the boundary-datastore work in mllam#635 so the model-side adapter (Joel's follow-up) doesn't have to break the schema again later. The return type of `WeatherDataset.__getitem__` stays as a tuple; the per-sample boundary tensor is dropped from the public output entirely (it's loaded internally for the data infrastructure that mllam#635 introduced, but not surfaced) because the model isn't consuming it today and the multi-source consumption path under mllam#652 will reintroduce it via a model-side `ForecastBatch` with per-source dicts.

Config schema (neural_lam/config.py):
- Replace `datastore` + `datastore_boundary` top-level keys with a single `datastores: Dict[str, DatastoreSelection]` mapping. The dict key becomes the canonical source name used throughout the pipeline.
- `DatastoreSelection` grows optional `inputs:` and `outputs:` per-category variable include-lists (parsed but not honoured at runtime yet; a follow-up filters tensors by them once Joel's model-side consumption lands and decides the shape).
- Validate at config load: raise InvalidConfigError when two datastores declare the same variable as an output, pointing at mdp's `dim_mapping.name_format` or `xr.Dataset.assign_coords` on the existing zarr's small `{category}_feature` coord.
- `load_config_and_datastore` returns `(config, Dict[str, BaseDatastore])` rather than the legacy `(config, interior, boundary)` triple.

WeatherDataset (neural_lam/weather_dataset.py):
- Constructor takes `(datastores, selections, ...)` dicts directly. Internally resolves the single interior + optional boundary pair for its existing slicing/windowing logic (which is unchanged).
- Boundary loading + windowing infrastructure stays intact - the boundary datastore is still loaded, the boundary forcing is still built per sample - but the per-sample tensor is intentionally not surfaced in __getitem__.
- __getitem__ returns the pre-mllam#635 4-tuple `(init_states, target_states, forcing, target_times)` with an explicit TODO(mllam#652) marker at the construction site pointing Joel at where the model-side ForecastBatch will plug in.
- `create_dataarray_from_tensor` refactored to expose a `build_dataarray_from_tensor` staticmethod so model.py can build DataArrays without instantiating a full WeatherDataset under the new constructor signature.

ForecasterModule (neural_lam/models/module.py):
- `on_after_batch_transfer` reverts to the pre-mllam#635 4-tuple unpack.
- `common_step` reverts to the pre-mllam#635 4-tuple unpack.
- `plot_examples` updates `time = batch[4]` to `time = batch[3]`.
- Boundary normalisation buffers and tiled caches removed; the boundary datastore reference (`self.datastore_boundary`) is still held so that the future model-side adapter (mllam#652 follow-up) can re-introduce normalisation alongside the per-source dict consumption.
- `_create_dataarray_from_tensor` uses the new `WeatherDataset.build_dataarray_from_tensor` staticmethod.

Production call site (neural_lam/train_model.py):
- Unpacks the new `(config, datastores)` return shape.
- Uses `_resolve_datastore_roles` from weather_dataset to pick out the interior + boundary for the legacy ForecasterModule constructor.

Example YAMLs (tests/datastore_examples/):
- Single-source danra: `datastores: {danra: ...}`.
- danra + era5 boundary: `datastores: {interior: ..., boundary: ...}` with explicit `outputs: {state: }` on interior so the resolver knows which one is the prognostic source.

Intentionally deferred to mllam#652 follow-up:
- Surfacing boundary in the per-sample output via a model-side ForecastBatch with per-source dicts.
- Honouring `inputs:` / `outputs:` include-lists at runtime.
- Diagnostic outputs (parsed in schema, not yet wired through).

Refs mllam#635, refs mllam#652.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
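To illustrate the duplicate-output validation and the interior resolution described above, here is a sketch over plain dicts. `InvalidConfigError` is redefined locally as a stand-in, the function names `validate_unique_outputs` and `resolve_interior` are hypothetical, and the variable names in the example config are made up; the real schema uses `DatastoreSelection` objects.

```python
class InvalidConfigError(Exception):
    """Local stand-in for the config error type named in the commit."""


def validate_unique_outputs(selections):
    """Reject configs where two datastores declare the same output variable.

    selections: {datastore_name: {"outputs": {category: [var, ...]}}}
    """
    owner = {}  # output variable name -> datastore that declared it
    for name, selection in selections.items():
        for variables in selection.get("outputs", {}).values():
            for var in variables:
                if var in owner:
                    raise InvalidConfigError(
                        f"Output variable '{var}' is declared by both "
                        f"'{owner[var]}' and '{name}'"
                    )
                owner[var] = name


def resolve_interior(selections):
    """Name of the single datastore that declares outputs (the interior)."""
    interiors = [n for n, sel in selections.items() if sel.get("outputs")]
    if len(interiors) != 1:
        raise InvalidConfigError(
            f"Expected exactly one datastore with outputs, got {interiors}"
        )
    return interiors[0]


selections = {
    "interior": {"outputs": {"state": ["u100m", "v100m"]}},
    "boundary": {"inputs": {"forcing": ["u", "v"]}},
}
validate_unique_outputs(selections)
print(resolve_interior(selections))  # interior
```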
Replace the single `datastore:` top-level config field with a `datastores:` mapping keyed by user-chosen names. Each entry is a DatastoreSelection with optional per-category `inputs` / `outputs` declarations; one datastore must declare outputs (the interior / prognostic source) and zero or more may contribute input-only sources that are reserved for the model-side multi-source consumption (the mllam#652 follow-up).
WeatherDataset and WeatherDataModule take `datastores` and `selections` dicts; their per-sample return shape and the model unpack are unchanged from current main, so this is a config + data-loader constructor refactor only. Internally the dataset still operates on the interior datastore alone.
load_config_and_datastore returns (config, Dict[str, BaseDatastore]). A config-time validator rejects two datastores declaring the same output variable name, with an error message pointing at mdp's `dim_mapping.name_format` and `xr.Dataset.assign_coords` as the two ways to disambiguate.
Other callers updated:
- train_model.py resolves interior + boundary roles for the legacy single-source model side.
- create_graph.py and plot_graph.py resolve the interior datastore via `_resolve_datastore_roles` instead of the old 2-tuple.
- module.py refactors `_create_dataarray_from_tensor` to use a new `WeatherDataset.build_dataarray_from_tensor` staticmethod so the model doesn't need to instantiate a full WeatherDataset with the new dict signature.
This PR is an alternative to mllam#635: it adopts the public schema proposed in mllam#652 without bringing in mllam#635's internal boundary loading. Boundary forcing, multi-source inputs and diagnostic outputs land via the mllam#652 model-side follow-up.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
6d237d7 to 78a4d52
Register a `slow` marker in `pyproject.toml` and a matching `--run-slow` CLI flag in `tests/conftest.py`. Tests carrying `@pytest.mark.slow` are skipped by default and can be opted into via `pytest --run-slow`. Mark `test_training` (parametrised over all datastores) and `test_training_output_std` as slow because they run a real `trainer.fit` loop on the MDP / npyfilesmeps datastores and take minutes per parametrisation. The fast unit tests in the suite (`test_all_gather_cat_*`, datastore tests against the dummy fixture, etc.) continue to run on every `pytest` invocation. Motivated by #635 which already needed this mechanism for the ERA5 boundary integration test. Landing the infra separately so it's reusable across the suite (e.g. the `test_state_only_datastore_*` training test on #231 and future trainer.fit-based regressions). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Describe your changes
Add optional boundary datastore support to `WeatherDataset` and `WeatherDataModule`, enabling LAM models to ingest boundary forcing from a separate domain (e.g. ERA5 for a COSMO/DANRA interior).

- `NeuralLAMConfig` accepts an optional `datastore_boundary` field
- `load_config_and_datastore` returns a 3-tuple `(config, datastore, datastore_boundary)`
- `WeatherDataset.__getitem__` returns a 5-tuple `(init_states, target_states, forcing, boundary, target_times)` where the boundary tensor is empty (last dim 0) when no boundary datastore is configured
- `--num_past_boundary_steps` / `--num_future_boundary_steps` control the boundary forcing window size
- `ForecasterModule.common_step` unpacks the boundary tensor but does not yet wire it into the forward pass (model-side integration is planned as a separate PR)
- `MDPDatastore` and `NpyFilesDatastoreMEPS` both handle boundary-only configs (forcing + static, no state variables) gracefully
- `tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/` (WeatherBench2 64x32 equiangular grid as boundary for DANRA interior), with `init_datastore_boundary_example()` fixture in `conftest.py`

This is PR A in the boundary datastore plan outlined in #108. PR B (model-side boundary handling) and PR C (#636, boundary plotting) will follow.
Issue Link
refs #108
Type of change
Note on breaking change: `WeatherDataset.__getitem__` now returns a 5-tuple instead of a 4-tuple, and `load_config_and_datastore` returns a 3-tuple instead of a 2-tuple. All callers in the repo have been updated.
Checklist before requesting a review
`pull` with `--rebase` option if possible).
Checklist for reviewers
Each PR comes with its own improvements and flaws. The reviewer should check the following:
Author checklist after completed review
reflecting type of change (add section where missing):
Checklist for assignee
refs #138