feat: add optional boundary datastore support by sadamov · Pull Request #635 · mllam/neural-lam

sadamov · 2026-05-11T14:01:20Z

Describe your changes

Add optional boundary datastore support to WeatherDataset and WeatherDataModule, enabling LAM models to ingest boundary forcing from a separate domain (e.g. ERA5 for a COSMO/DANRA interior).

NeuralLAMConfig accepts an optional datastore_boundary field
load_config_and_datastore returns a 3-tuple (config, datastore, datastore_boundary)
WeatherDataset.__getitem__ returns a 5-tuple (init_states, target_states, forcing, boundary, target_times) where the boundary tensor is empty (last dim 0) when no boundary datastore is configured
New CLI args --num_past_boundary_steps / --num_future_boundary_steps control the boundary forcing window size
ForecasterModule.common_step unpacks the boundary tensor but does not yet wire it into the forward pass (model-side integration is planned as a separate PR)
Boundary datastore can be any registered datastore type (mdp, npyfilesmeps, etc.)
MDPDatastore and NpyFilesDatastoreMEPS both handle boundary-only configs (forcing + static, no state variables) gracefully
ERA5 boundary test configs added at tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ (WeatherBench2 64x32 equiangular grid as boundary for DANRA interior), with init_datastore_boundary_example() fixture in conftest.py
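
The two new CLI arguments determine how many boundary time steps are stacked per sample. A minimal sketch of that relationship, assuming the window includes the current step plus the requested past and future steps; the defaults and the helper `boundary_window_length` are hypothetical and not taken from this PR:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser holding only the two boundary-window flags."""
    parser = argparse.ArgumentParser()
    # Defaults of 1 are an assumption for this sketch, not taken from the PR.
    parser.add_argument("--num_past_boundary_steps", type=int, default=1)
    parser.add_argument("--num_future_boundary_steps", type=int, default=1)
    return parser


def boundary_window_length(num_past: int, num_future: int) -> int:
    """Steps per window, assuming past + current + future are all included."""
    return num_past + num_future + 1


args = build_parser().parse_args(
    ["--num_past_boundary_steps", "2", "--num_future_boundary_steps", "1"]
)
window = boundary_window_length(
    args.num_past_boundary_steps, args.num_future_boundary_steps
)
print(window)  # 4
```

Under this assumption, the last dimension of the boundary tensor would be the number of boundary features times the window length.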

This is PR A in the boundary datastore plan outlined in #108. PR B (model-side boundary handling) and PR C (#636, boundary plotting) will follow.

Issue Link

refs #108

Type of change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📖 Documentation (Addition or improvements to documentation)

Note on breaking change: WeatherDataset.__getitem__ now returns a 5-tuple instead of 4-tuple, and load_config_and_datastore returns a 3-tuple instead of 2-tuple. All callers in the repo have been updated.

Checklist before requesting a review

My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
I have performed a self-review of my code
For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
I have updated the README to cover introduced code changes
I have added tests that prove my fix is effective or that my feature works
I have given the PR a name that clearly describes the change, written in imperative form (context).
I have requested a reviewer and an assignee (assignee is responsible for merging). This applies only if you have write access to the repo, otherwise feel free to tag a maintainer to add a reviewer and assignee.

Checklist for reviewers

Each PR comes with its own improvements and flaws. The reviewer should check the following:

the code is readable
the code is well tested
the code is documented (including return types and parameters)
the code is easy to maintain

Author checklist after completed review

I have added a line to the CHANGELOG describing this change, in a section
reflecting type of change (add section where missing):
- added: when you have added new functionality
- changed: when default behaviour of the code has been changed
- fixes: when your contribution fixes a bug
- maintenance: when your contribution relates to repo maintenance, e.g. CI/CD or documentation

Checklist for assignee

PR is up to date with the base branch
the tests pass
(if the PR is not just maintenance/bugfix) the PR is assigned to the next milestone. If it is not, propose it for a future milestone.
author has added an entry to the changelog (and designated the change as added, changed, fixed or maintenance)
Once the PR is ready to be merged, squash commits and merge the PR.

refs #138

Add support for loading boundary forcing from a separate datastore, enabling LAM models to ingest boundary conditions from a different domain (e.g. ERA5 boundaries for a COSMO/DANRA interior).

- NeuralLAMConfig accepts optional `datastore_boundary` field
- load_config_and_datastore returns 3-tuple (config, datastore, datastore_boundary)
- WeatherDataset loads, windows, and standardizes boundary forcing
- __getitem__ returns 5-tuple (init_states, target_states, forcing, boundary, target_times)
- New CLI args --num_past_boundary_steps / --num_future_boundary_steps
- ForecasterModule.common_step unpacks boundary (not yet wired to forward)
- 4 new boundary-specific tests, all 157 tests pass

refs mllam#108

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Pass datastore_boundary through train_model.py into ForecasterModule. During --eval, plot_examples loads raw boundary forcing and overlays it underneath prediction/target panels via vis.plot_prediction. Add four boundary plotting tests using BoundaryDummyDatastore from PR mllam#635. Update README to document boundary plotting during evaluation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add MDP-based ERA5 boundary example at tests/datastore_examples/mdp/era5_1000hPa_danra_100m_winds/ with config.yaml, era5.datastore.yaml (WeatherBench2 64x32 equiangular), and danra.datastore.yaml (DANRA 100m winds interior). Add DATASTORES_BOUNDARY_EXAMPLES dict and init_datastore_boundary_example() to conftest.py for use in boundary integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

MDPDatastore.__init__ crashed with KeyError when loading a datastore that has only forcing+static (no state), e.g. ERA5 boundary data. Fix is_ensemble check to guard against missing state, and grid_shape_state to fall back to forcing/static categories. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Make _get_analysis_times fall back to forcing file patterns when no state files exist, guard get_dataarray("state") against empty var_names, and prevent empty feature list from matching state loading path. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Guard against missing state/static feature keys in the zarr, not just forcing. Boundary-only datastores (e.g. ERA5) may lack state_feature entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Return len(grid_index) directly instead of computing from grid_shape_state, which is more robust for boundary-only datastores. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Shrink the ERA5 boundary test dataset to 2022-03-30..2022-04-12 (was 1990-2022), add per-input lat/lon coord_ranges and enable mllam's convex-hull domain_cropping with include_interior_points=true. Stats computation drops from minutes to seconds and the cached zarr stays under 1 MB.
- Stack [longitude, latitude] directly into grid_index in the era5 dim_mapping (per mllam's example.era5_cropped.yaml) so the original coord names survive for the convex-hull crop -- removes the need for any rename-preserve workaround in neural-lam.
- Generalise the MDPDatastore units loop over self.spatial_coordinates with sensible defaults for x/y (m) and longitude/latitude/lon/lat (degrees_*) so ERA5-style geographic datastores work.
- Register a pytest `slow` marker and a `--run-slow` CLI flag so the ERA5 boundary integration test (added in this PR) is skipped by default and can be run on demand.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Drop state metadata after init and raise KeyError on `state` lookups so plotting/model code that accidentally queries state on a boundary fails loudly. Real ERA5-style boundary datastores expose only forcing fields, and the existing boundary tests (test_datasets.py) only ever access forcing on the boundary, so making the dummy state-less brings it closer to real boundary semantics without changing test behaviour.

Asserts plot_prediction works against a boundary datastore with no state category (forcing-only), exercising the get_xy("forcing") / get_lat_lon( "forcing") path in vis.plot_on_axis. Pairs with the BoundaryDummyDatastore state-less change in mllam#635.

Resolve conflicts from mllam#239 (normalize on GPU):

- weather_dataset.py: drop the CPU standardization path (state/forcing/boundary stats setup, _compute_std_safe, in-__getitem__ scaling, and the standardize= plumbing in WeatherDataModule); keep the boundary-datastore feature and the 5-tuple sample (init, target, forcing, boundary, times).
- models/module.py: on_after_batch_transfer now unpacks/returns the 5-tuple, standardizing state+forcing on-device and passing boundary through unchanged (boundary is not yet consumed by the forecaster on this branch).
- tests/test_datasets.py: drop the dataset-standardization tests mllam#239 removed (incl. boundary standardization, now a GPU concern); keep the structural boundary tests without the removed standardize= kwarg.
- tests/test_gpu_normalization.py: feed/expect the 5-tuple batch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

joeloskarsson

It's so great that you started the work on this 😄 I had a look through everything except the tests now. Had one major comment about the interior-boundary data alignment in the dataset, otherwise mostly small things.

joeloskarsson · 2026-05-22T15:31:18Z

+        else:
+            raise ValueError("Dataset has no state, forcing, or static data")


Is this really possible now? That would mean we have a fully empty datastore. I feel like we should raise an error well before this point (in the constructor).

Fair point, fully empty was never meant to be valid. Added a check in __init__ that requires state or forcing, and simplified grid_shape_state to just pick between those two (dropped the static fallback and the defensive else). fixed in cf323a8.
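
The constructor check described in this reply can be sketched as follows. This is a hypothetical stand-in class, not the actual MDPDatastore code; the `grid_shapes` dict and the error message are assumptions made for illustration:

```python
class SketchDatastore:
    """Hypothetical stand-in; not the real MDPDatastore."""

    def __init__(self, grid_shapes: dict[str, tuple[int, int]]):
        # Fail in the constructor if neither state nor forcing is present.
        if "state" not in grid_shapes and "forcing" not in grid_shapes:
            raise ValueError("Datastore needs state or forcing data")
        self._grid_shapes = grid_shapes

    @property
    def grid_shape_state(self) -> tuple[int, int]:
        # Simple pick: state if present, else forcing (no static fallback).
        key = "state" if "state" in self._grid_shapes else "forcing"
        return self._grid_shapes[key]


# A boundary-only store (forcing + static, no state) is accepted.
boundary = SketchDatastore({"forcing": (64, 32), "static": (64, 32)})
print(boundary.grid_shape_state)  # (64, 32)
```

Validating up front means a static-only or empty store never reaches `grid_shape_state`, so the property needs no defensive `else` branch.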

joeloskarsson · 2026-05-22T15:34:04Z

+        normalized here so the work runs on the accelerator. The boundary
+        forcing is passed through unchanged.


Is there a good reason to not do the standardization of boundary data here? This feels confusing and inconsistent to me. Best to do all standardization for the batch in the same place.

As we discussed over lunch, I would argue that this is now part of the model-side PR B, since we now use the on_after_batch_transfer hook and have relieved the WeatherDataset of its standardization duties. But we agreed to implement it here nonetheless: ForecasterModule.__init__ now takes an optional datastore_boundary and registers boundary_mean/boundary_std buffers, and on_after_batch_transfer standardizes the boundary exactly like the interior forcing. With no boundary datastore the buffers stay None and the tensor passes through unchanged. Covered by the two new tests in tests/test_gpu_normalization.py. Done in 9a9af31.
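
A minimal sketch of the standardization described above, written with NumPy rather than the module's actual tensors. The function `standardize_boundary` is hypothetical, and the feature-major layout of the last dimension (index = feature * window + step) is an assumption for illustration:

```python
from typing import Optional

import numpy as np


def standardize_boundary(
    boundary: np.ndarray,
    mean: Optional[np.ndarray],
    std: Optional[np.ndarray],
    window: int,
) -> np.ndarray:
    """Standardize a (..., n_features * window) boundary array.

    Assumes a feature-major last dimension, i.e. index = f * window + w.
    """
    if mean is None or std is None:
        # No boundary datastore configured: pass through unchanged.
        return boundary
    # Repeat each per-feature statistic across its window slots.
    mean_tiled = np.repeat(mean, window)
    std_tiled = np.repeat(std, window)
    return (boundary - mean_tiled) / std_tiled


# Two features, window of 3, one time step, one grid point.
raw = np.array([[[1.0, 1.0, 1.0, 10.0, 10.0, 10.0]]])
out = standardize_boundary(raw, np.array([1.0, 10.0]), np.array([2.0, 5.0]), 3)
```

Here every value equals its feature mean, so `out` is all zeros. The sketch recomputes the tiled statistics on each call; caching them after the first batch would avoid that.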

joeloskarsson · 2026-05-22T15:45:30Z

+        (
+            init_states,
+            target_states,
+            forcing_features,
+            boundary_features,
+            batch_times,
+        ) = batch


Suggested change

-        (
-            init_states,
-            target_states,
-            forcing_features,
-            boundary_features,
-            batch_times,
-        ) = batch
+        (
+            init_states,
+            target_states,
+            forcing_features,
+            boundary_features,
+            batch_times,
+        ) = batch
+        # NOTE: For now we do not use the boundary features from here. This is yet
+        # to be implemented on the model side.

Applied with a small reword (0ec56f5). Boundary is now standardized in on_after_batch_transfer (comment #2) but still not consumed by the forward pass, so in 9a9af31 I extended the NOTE to point at the standardization and reference #108 for the model-side wiring.

joeloskarsson · 2026-05-22T15:53:07Z

        # init_states: (2, N_grid, d_features)
        # target_states: (ar_steps, N_grid, d_features)
        # forcing: (ar_steps, N_grid, d_windowed_forcing)
+        # boundary: (ar_steps, N_boundary_grid, d_windowed_boundary)


I am wondering if it is wise to return an empty tensor here, or if it would just be better for this to be None? I suppose that with None we would need a custom (but simple) collate function for the batching.
My reasoning is that if you for example forget to specify a boundary datastore then many things will still work and this will be a silent bug, potentially even the forward pass would work (in some scenario)? I can not see a case where we would not want to explicitly do things differently in a model depending on if the boundary forcing is present or not, and it being None would be a good signifier of this. Otherwise everything processing boundary would need to have the boundary datastore, to check if that is None.
I am not sure about this, and could probably be convinced either way. But would be happy to hear your thoughts.

I went with an empty tensor to avoid a custom collate and keep __getitem__ shape-stable for code that just unpacks the 5-tuple. The silent-bug risk is real though, and I think the model wiring (PR B) is the right place to catch it: the model knows whether it expects boundary, so we can assert there that boundary.shape[-1] > 0 matches datastore_boundary is not None. Keeps the loader simple but still fails loudly when they get out of sync.
what do you think?
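
The consistency check proposed in this reply could look roughly like the sketch below. The function name and error text are hypothetical; only the condition (`boundary.shape[-1] > 0` must agree with `datastore_boundary is not None`) comes from the discussion:

```python
def check_boundary_consistency(boundary_shape, datastore_boundary) -> None:
    """Raise if the boundary tensor and the configured datastore disagree."""
    has_features = boundary_shape[-1] > 0
    expects_boundary = datastore_boundary is not None
    if has_features != expects_boundary:
        raise ValueError(
            f"boundary tensor has last dim {boundary_shape[-1]} but "
            f"datastore_boundary is "
            f"{'set' if expects_boundary else 'None'}"
        )


# Empty boundary tensor and no boundary datastore: consistent, no error.
check_boundary_consistency((4, 0, 0), None)
# Non-empty boundary tensor with a configured datastore: also consistent.
check_boundary_consistency((4, 120, 6), object())
```

A model that requires boundary input could call such a check once per batch or once at setup, so a forgotten boundary datastore fails loudly instead of training silently.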

joeloskarsson · 2026-05-22T16:01:42Z

+        if self.da_boundary_forcing is not None:
+            da_boundary_windowed = self._slice_forcing_time(
+                da_forcing=self.da_boundary_forcing,
+                idx=sample_idx,


I don't think this is the correct idx for the boundary. What about when the boundary has a different time step? What about when the boundary is a forecast? I feel like we used to have a lot of logic to find this alignment, that I can't see now, which makes me fear that we dropped something important.

But this might also be a matter of the scope of this PR, if you were intending this to restrict to reanalysis boundary with the same timestep? I would though prefer to make what we merge in as similar as possible to our contribution in the paper :)

You were right, I messed up the partial port from the research branch, trying to reduce the scope of the PR.
Ported the time-based alignment from the research branch, so all four combinations work now: analysis/forecast interior crossed with analysis/forecast boundary. See 140caf5.

A few intentional deviations from the research branch:

No interior_subsample_step / boundary_subsample_step (buggy, orthogonal)

No window_time_deltas, dynamic_time_deltas, time_slice concat in the boundary tensor (didn't help, can be added later)

Per-step the window still centers on the target time (the boundary condition for the interior at the predicted time), but only after the launch is fixed correctly on init. see a543ce9
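
To illustrate the time-based alignment for an analysis boundary with a coarser time step than the interior, here is a minimal sketch using only the standard library. The helpers `pad_lookup` and `boundary_window` are hypothetical and do not reproduce the functions in `neural_lam`; the sketch assumes the window is counted in boundary time steps around the pad-matched target time:

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def pad_lookup(boundary_times: list[datetime], target: datetime) -> int:
    """Index of the latest boundary time that is <= target (pad matching)."""
    idx = bisect_right(boundary_times, target) - 1
    if idx < 0:
        raise ValueError("target time precedes the boundary coverage")
    return idx


def boundary_window(boundary_times, target, num_past, num_future):
    """Boundary times in a window centred on the pad-matched target time."""
    center = pad_lookup(boundary_times, target)
    lo, hi = center - num_past, center + num_future
    if lo < 0 or hi >= len(boundary_times):
        raise ValueError("window falls outside the boundary coverage")
    return boundary_times[lo : hi + 1]


start = datetime(2022, 4, 1)
boundary_times = [start + timedelta(hours=6 * i) for i in range(8)]  # 6h steps
target = start + timedelta(hours=13)  # hourly interior target time

window = boundary_window(boundary_times, target, num_past=1, num_future=1)
print([t.hour for t in window])  # [6, 12, 18]
```

The explicit range check plays the role that cropping and overlap checks play in the dataset: samples whose window would leave the boundary coverage must be excluded or rejected.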

joeloskarsson · 2026-05-22T16:05:52Z

Could we also port over a version of https://github.com/joeloskarsson/neural-lam-dev/blob/research/tests/test_time_slicing.py? I remember this being very useful to figure out all the alignment between interior and boundary data.

Good idea, that file is really useful. Extended the existing tests/test_time_slicing.py rather than adding a new one. SinglePointDummyDatastore now supports forecast mode, plus a new BoundaryOnlyDummyDatastore that mirrors a real boundary store (forcing-only, state access raises KeyError).
Added in cbc1c14.
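
The idea behind a boundary-only dummy datastore can be sketched as below. This is a hypothetical class for illustration, not the `BoundaryOnlyDummyDatastore` from the test suite, and the variable names are made up:

```python
class BoundaryOnlySketch:
    """Hypothetical forcing-only store; not the test suite's actual class."""

    def __init__(self, forcing_vars: list[str]):
        self._vars = {"forcing": list(forcing_vars)}

    def get_vars_names(self, category: str) -> list[str]:
        if category == "state":
            # Fail loudly so accidental state lookups on a boundary are caught.
            raise KeyError("boundary datastore has no 'state' category")
        return self._vars.get(category, [])


store = BoundaryOnlySketch(["u100m", "v100m"])
print(store.get_vars_names("forcing"))  # ['u100m', 'v100m']
```

Raising on `state` instead of returning an empty list makes any code path that wrongly treats the boundary as an interior datastore fail in tests.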

…pe_state Previously a datastore with no state, forcing or static would silently reach `grid_shape_state` and fail with a confusing fallback error. Now the constructor raises immediately if neither state nor forcing is present, and the fallback in `grid_shape_state` collapses to a simple `state if present else forcing` pick. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Boundary features are unpacked from the batch but the forecaster forward pass does not consume them yet; the model-side wiring lands in a follow-up PR (mllam#108). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…is/forecast modes

Replace integer-idx boundary slicing with time-based nearest-neighbor (pad) lookup so the boundary datastore can have a different step length than the interior, and either side may be in analysis or forecast mode.

- Add `get_time_step`, `check_time_overlap`, `crop_time_if_needed` helpers in `neural_lam.utils` (ported from the research branch in joeloskarsson/neural-lam-dev, with an extra guard against silent argmax-on-all-false cropping).
- Refactor `WeatherDataset`: precompute the within-sample state step and any forecast lead-time step in __init__; run `crop_time_if_needed` + `check_time_overlap` against the boundary so the first/last samples never fall outside boundary coverage; replace `_slice_forcing_time` with `_window_forcing_in_time` for time-aligned windowing of cross-datastore boundary; preserve the original integer-idx fast path as `_window_same_forecast_by_idx` for same-datastore forecast forcing (npyfilesmeps has non-unique analysis_time so the pandas pad-lookup cannot be used there).
- Window alignment matches the existing forcing convention (target time, i.e. `state_times[init_steps + i]`).
- Split `test_boundary_dataset_length_unchanged` into a no-crop and a cropping case to document the new behaviour.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Port a slimmed-down version of the alignment tests from joeloskarsson/neural-lam-dev to exercise the new time-based boundary windowing in WeatherDataset.

- Extend `SinglePointDummyDatastore` with forecast-mode support.
- Add a boundary-only variant whose state lookup raises KeyError, to catch any path that accidentally asks the boundary for state.
- `test_time_slicing_boundary_analysis`: parametrised over past/future window sizes, asserts exact boundary window values around each target state time.
- `test_boundary_step_length_mismatch_supported`: 1h interior with a 6h boundary, verifies the pad-matched lookup.
- `test_forecast_interior_with_analysis_boundary` and `test_analysis_interior_with_forecast_boundary`: the two mixed analysis/forecast combinations.
- `test_check_time_overlap_insufficient_raises`: surface the cropping failure path with a clear error.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Boundary forcing was previously passed through unchanged in `on_after_batch_transfer`, leaving the only normalization step on a separate code path and inconsistent with how interior state/forcing are handled. Wire it through the same on-device hook.

- `ForecasterModule.__init__` takes a new optional `datastore_boundary` arg (excluded from `save_hyperparameters` alongside `datastore` and `forecaster`, so it must be passed at `load_from_checkpoint` time).
- Register `boundary_mean` / `boundary_std` from `datastore_boundary.get_standardization_dataarray("forcing")` when a boundary datastore is provided; otherwise leave both as None.
- `on_after_batch_transfer` standardizes the boundary tensor the same way it standardizes forcing: feature-major `(feature, window)` per-feature stats are tiled once on the first batch and cached.
- Update the NOTE in `common_step` to reflect that boundary is now standardized but still not consumed by the forecaster (mllam#108).
- Pass `datastore_boundary` from `train_model.main` to the module.
- New tests: `test_boundary_standardized_when_datastore_provided` and `test_boundary_passthrough_when_no_boundary_datastore`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

The forecast-boundary path selected its analysis_time by pad-matching the first target time, which can pick a boundary forecast launched after model init - unavailable operationally. Anchor on the model init time (state_times[init_steps - 1], strictly before) instead, matching the research branch, and assert the per-step boundary valid time never runs ahead of the target. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
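
The anchoring rule in this commit message (pick the boundary forecast launched strictly before model init) can be sketched with the standard library. The function `select_boundary_launch` is hypothetical and assumes a sorted list of boundary launch times:

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def select_boundary_launch(analysis_times: list[datetime], init_time: datetime) -> int:
    """Index of the latest boundary forecast launched strictly before init_time."""
    # bisect_left returns the first index with a launch time >= init_time,
    # so the entry just before it is the latest launch strictly earlier.
    idx = bisect_left(analysis_times, init_time) - 1
    if idx < 0:
        raise ValueError("no boundary forecast was launched before model init")
    return idx


day = datetime(2022, 4, 1)
launches = [day + timedelta(hours=h) for h in (0, 6, 12)]

# Model init at 06:00: the 06:00 launch is not strictly before, so pick 00:00.
print(select_boundary_launch(launches, day + timedelta(hours=6)))  # 0
# Model init at 07:00: the 06:00 launch is now available.
print(select_boundary_launch(launches, day + timedelta(hours=7)))  # 1
```

Matching on the first target time instead could select a launch that lies after model init, which would not be available in an operational setting.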

observingClouds · 2026-06-02T05:46:27Z

Hi @sadamov,
Thanks for starting and advancing this feature integration. As this is something that we at DMI need to make our operational AI model, let me and @SimonKamuk know if we can help to bring this over the finish line. We might have some comments too 😛

sadamov · 2026-06-02T07:22:33Z

@observingClouds yes please! Feel free to leave your reviews. If there are major pieces missing, you should have push access to my PR-branch here and after a quick ping you can also push directly.

Resolved conflicts:

- neural_lam/datastore/mdp.py: kept the state-or-forcing-required validation while taking main's local _ds variable pattern (refactored upstream).
- neural_lam/train_model.py: moved --num_past_boundary_steps and --num_future_boundary_steps to use data_group.add_argument, matching the argument-groups refactor from mllam#641.
- neural_lam/weather_dataset.py: kept mllam#635's renamed _window_same_forecast_by_idx and shared_kwargs setup pattern; added type hints to match the post-mllam#631 type-hint sweep. Updated __getitem__ and __iter__ return types from 4-tuple to 5-tuple to reflect the new boundary tensor. Typed shared_kwargs as dict[str, Any] to satisfy mypy on the **kwargs splat.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Resolved conflicts:

- neural_lam/train_model.py: moved --num_past_boundary_steps and --num_future_boundary_steps to use data_group.add_argument, matching the argument-groups refactor from mllam#641 (same as mllam#635 resolution).
- neural_lam/weather_dataset.py: kept mllam#636's _slice_forcing_time signature with the extra num_past_steps/num_future_steps params, added type hints from main. Typed shared_kwargs as dict[str, Any] to satisfy mypy on the **kwargs splat. Updated _build_item_dataarrays, __getitem__, and __iter__ return types from 4-tuple to 5-tuple to reflect the new boundary tensor.
- neural_lam/vis.py: kept mllam#636's boundary_da / boundary_datastore / boundary_margin_degrees args on plot_on_axis, plot_prediction, and plot_spatial_error. crop_to_interior is preserved as a deprecated parameter on plot_prediction and plot_spatial_error (already handled with a deprecation warning earlier on this branch). Type hints from main's post-mllam#631 sweep are applied throughout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Matches the CI invocation introduced in mllam#651 so this branch's eventual ERA5 boundary integration test (and any other @pytest.mark.slow on this branch) is exercised by CI, not just by local --run-slow invocations. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

leifdenby · 2026-06-08T12:14:34Z

@sadamov I added #652 to say how I would generalise the config. I am not saying that we need to use that config layout with this PR, but you could maybe have a look at that PR and see what you think? If you do like to use it then we could maybe consider using a structure in this PR that could remain relatively unchanged down the line.

Per the simplification in mllam#651, the custom --run-slow flag is being dropped in favour of relying on pytest's native -m marker selection. Remove the corresponding pytest_addoption + pytest_collection_modifyitems from conftest and the --run-slow flag from CI on this branch too, so this PR doesn't re-introduce conflicts when mllam#651 lands. The slow marker registration in pyproject.toml stays, ready for use on any future @pytest.mark.slow test on this branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… constructor only)

Pre-emptively land the public CONFIG-LAYER shape proposed in mllam#652 on top of the boundary-datastore work in mllam#635 so the model-side adapter (Joel's follow-up) doesn't have to break the schema again later. The return type of `WeatherDataset.__getitem__` stays as a tuple; the per-sample boundary tensor is dropped from the public output entirely (it's loaded internally for the data infrastructure that mllam#635 introduced, but not surfaced) because the model isn't consuming it today and the multi-source consumption path under mllam#652 will reintroduce it via a model-side `ForecastBatch` with per-source dicts.

Config schema (neural_lam/config.py):
- Replace `datastore` + `datastore_boundary` top-level keys with a single `datastores: Dict[str, DatastoreSelection]` mapping. The dict key becomes the canonical source name used throughout the pipeline.
- `DatastoreSelection` grows optional `inputs:` and `outputs:` per-category variable include-lists (parsed but not honoured at runtime yet; a follow-up filters tensors by them once Joel's model-side consumption lands and decides the shape).
- Validate at config load: raise InvalidConfigError when two datastores declare the same variable as an output, pointing at mdp's `dim_mapping.name_format` or `xr.Dataset.assign_coords` on the existing zarr's small `{category}_feature` coord.
- `load_config_and_datastore` returns `(config, Dict[str, BaseDatastore])` rather than the legacy `(config, interior, boundary)` triple.

WeatherDataset (neural_lam/weather_dataset.py):
- Constructor takes `(datastores, selections, ...)` dicts directly. Internally resolves the single interior + optional boundary pair for its existing slicing/windowing logic (which is unchanged).
- Boundary loading + windowing infrastructure stays intact - the boundary datastore is still loaded, the boundary forcing is still built per sample - but the per-sample tensor is intentionally not surfaced in __getitem__.
- __getitem__ returns the pre-mllam#635 4-tuple `(init_states, target_states, forcing, target_times)` with an explicit TODO(mllam#652) marker at the construction site pointing Joel at where the model-side ForecastBatch will plug in.
- `create_dataarray_from_tensor` refactored to expose a `build_dataarray_from_tensor` staticmethod so model.py can build DataArrays without instantiating a full WeatherDataset under the new constructor signature.

ForecasterModule (neural_lam/models/module.py):
- `on_after_batch_transfer` reverts to the pre-mllam#635 4-tuple unpack.
- `common_step` reverts to the pre-mllam#635 4-tuple unpack.
- `plot_examples` updates `time = batch[4]` to `time = batch[3]`.
- Boundary normalisation buffers and tiled caches removed; the boundary datastore reference (`self.datastore_boundary`) is still held so that the future model-side adapter (mllam#652 follow-up) can re-introduce normalisation alongside the per-source dict consumption.
- `_create_dataarray_from_tensor` uses the new `WeatherDataset.build_dataarray_from_tensor` staticmethod.

Production call site (neural_lam/train_model.py):
- Unpacks the new `(config, datastores)` return shape.
- Uses `_resolve_datastore_roles` from weather_dataset to pick out the interior + boundary for the legacy ForecasterModule constructor.

Example YAMLs (tests/datastore_examples/):
- Single-source danra: `datastores: {danra: ...}`.
- danra + era5 boundary: `datastores: {interior: ..., boundary: ...}` with explicit `outputs: {state: }` on interior so the resolver knows which one is the prognostic source.

Intentionally deferred to mllam#652 follow-up:
- Surfacing boundary in the per-sample output via a model-side ForecastBatch with per-source dicts.
- Honouring `inputs:` / `outputs:` include-lists at runtime.
- Diagnostic outputs (parsed in schema, not yet wired through).

Refs mllam#635, refs mllam#652.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
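
The config-load validation mentioned in this commit message (no two datastores may declare the same output variable) might be sketched as follows. The plain-dict selection layout, the function name, the local `InvalidConfigError` class and the variable names are all assumptions made for illustration, not the actual `neural_lam/config.py` code:

```python
class InvalidConfigError(Exception):
    """Local stand-in for the error type named in the commit message."""


def validate_unique_outputs(selections: dict[str, dict]) -> None:
    """Reject configs where two datastores declare the same output variable."""
    owner: dict[str, str] = {}  # variable name -> datastore that outputs it
    for name, selection in selections.items():
        for variables in selection.get("outputs", {}).values():
            for var in variables:
                if var in owner:
                    raise InvalidConfigError(
                        f"'{var}' is declared as an output by both "
                        f"'{owner[var]}' and '{name}'"
                    )
                owner[var] = name


# Valid: only the interior declares outputs; the boundary is input-only.
validate_unique_outputs(
    {
        "interior": {"outputs": {"state": ["u100m", "v100m"]}},
        "boundary": {"inputs": {"forcing": ["u", "v"]}},
    }
)
```

Checking at config load keeps the failure close to its cause: the user sees which two datastores collide before any data is opened.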

Replace the single `datastore:` top-level config field with a `datastores:` mapping keyed by user-chosen names. Each entry is a DatastoreSelection with optional per-category `inputs` / `outputs` declarations; one datastore must declare outputs (the interior / prognostic source) and zero or more may contribute input-only sources that are reserved for the model-side multi-source consumption (the mllam#652 follow-up).

WeatherDataset and WeatherDataModule take `datastores` and `selections` dicts; their per-sample return shape and the model unpack are unchanged from current main, so this is a config + data-loader constructor refactor only. Internally the dataset still operates on the interior datastore alone.

load_config_and_datastore returns (config, Dict[str, BaseDatastore]). A config-time validator rejects two datastores declaring the same output variable name, with an error message pointing at mdp's `dim_mapping.name_format` and `xr.Dataset.assign_coords` as the two ways to disambiguate.

Other callers updated:
- train_model.py resolves interior + boundary roles for the legacy single-source model side.
- create_graph.py and plot_graph.py resolve the interior datastore via `_resolve_datastore_roles` instead of the old 2-tuple.
- module.py refactors `_create_dataarray_from_tensor` to use a new `WeatherDataset.build_dataarray_from_tensor` staticmethod so the model doesn't need to instantiate a full WeatherDataset with the new dict signature.

This PR is an alternative to mllam#635: it adopts the public schema proposed in mllam#652 without bringing in mllam#635's internal boundary loading. Boundary forcing, multi-source inputs and diagnostic outputs land via the mllam#652 model-side follow-up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
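
A rough sketch of how roles could be resolved from such a `datastores:` mapping. The function `resolve_roles` is hypothetical and is not the `_resolve_datastore_roles` helper from the PR, whose signature is not shown here; it assumes exactly one entry declares outputs and treats every other entry as input-only:

```python
def resolve_roles(selections: dict[str, dict]) -> tuple[str, list[str]]:
    """Return (interior_name, input_only_names) from a selections mapping."""
    # The interior / prognostic source is the entry that declares outputs.
    with_outputs = [name for name, sel in selections.items() if sel.get("outputs")]
    if len(with_outputs) != 1:
        raise ValueError(
            f"expected exactly one datastore with outputs, got {with_outputs}"
        )
    interior = with_outputs[0]
    input_only = [name for name in selections if name != interior]
    return interior, input_only


interior, input_only = resolve_roles(
    {
        "interior": {"outputs": {"state": ["u100m", "v100m"]}},
        "boundary": {},
    }
)
print(interior, input_only)  # interior ['boundary']
```

Deriving the roles from the declarations rather than from the dict keys means users can name their datastores freely.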

Match the marker registration used on mllam#635/mllam#651 so any future @pytest.mark.slow test on this branch is recognised by pytest without warnings. No tests currently use the marker on mllam#656. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

sadamov · 2026-06-09T07:44:50Z

@sadamov I added #652 to say how I would generalise the config. I am not saying that we need to use that config layout with this PR, but you could maybe have a look at that PR and see what you think? If you do like to use it then we could maybe consider using a structure in this PR that could remain relatively unchanged down the line.

#652 (comment)

Register a `slow` marker in `pyproject.toml` and a matching `--run-slow` CLI flag in `tests/conftest.py`. Tests carrying `@pytest.mark.slow` are skipped by default and can be opted into via `pytest --run-slow`. Mark `test_training` (parametrised over all datastores) and `test_training_output_std` as slow because they run a real `trainer.fit` loop on the MDP / npyfilesmeps datastores and take minutes per parametrisation. The fast unit tests in the suite (`test_all_gather_cat_*`, datastore tests against the dummy fixture, etc.) continue to run on every `pytest` invocation. Motivated by #635 which already needed this mechanism for the ERA5 boundary integration test. Landing the infra separately so it's reusable across the suite (e.g. the `test_state_only_datastore_*` training test on #231 and future trainer.fit-based regressions). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

sadamov and others added 2 commits May 11, 2026 16:00

Update CHANGELOG PR link to mllam#635

3dc0045

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

sadamov self-assigned this May 11, 2026

sadamov added the enhancement New feature or request label May 11, 2026

sadamov added this to the v0.8.0 milestone May 11, 2026

sadamov requested review from joeloskarsson and leifdenby May 11, 2026 14:13

sadamov mentioned this pull request May 11, 2026

Add boundary data overlay to evaluation plots #636

Open

21 tasks

sadamov and others added 9 commits May 11, 2026 17:17

docs: mention MDP boundary-only datastore support in CHANGELOG

affb441

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: MDP get_vars_names/get_vars_long_names for any missing category

e5d8061

Guard against missing state/static feature keys in the zarr, not just forcing. Boundary-only datastores (e.g. ERA5) may lack state_feature entirely. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: add num_grid_points override to MDPDatastore

8d8bb77

Return len(grid_index) directly instead of computing from grid_shape_state, which is more robust for boundary-only datastores. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge branch 'main' into feat/boundary-datastore

2b12e76

Merge remote-tracking branch 'sadamov/feat/boundary-datastore' into f…

b3e4ef6

…eat/boundary-datastore

sadamov and others added 2 commits May 18, 2026 09:22

Merge branch 'main' into feat/boundary-datastore

a839263

joeloskarsson requested changes May 22, 2026

View reviewed changes

sadamov and others added 5 commits May 28, 2026 05:03

Merge branch 'main' of https://github.com/mllam/neural-lam into feat/…

41ee6f6

…boundary-datastore

sadamov and others added 2 commits May 28, 2026 17:08

sadamov requested a review from joeloskarsson May 28, 2026 19:27

sadamov requested review from SimonKamuk and observingClouds June 2, 2026 07:23

leifdenby mentioned this pull request Jun 8, 2026

[RFC/Design]: Support for multiple datastores #652

Open

sadamov mentioned this pull request Jun 9, 2026

feat: adopt #652 multi-datastore schema (config + constructor only, no model touch) sadamov/neural-lam#1

Merged

21 tasks

sadamov force-pushed the feat/boundary-datastore branch from 6d237d7 to 78a4d52 Compare June 9, 2026 07:25

sadamov mentioned this pull request Jun 9, 2026

feat(config): adopt #652 multi-datastore schema (config + WeatherDataset constructor only) #656

Open

21 tasks

sadamov added the discussion label Jun 9, 2026

4 participants
