Updating the format of our rollout (`eval.py`) to make it easier to calculate metrics (true to the original data). by alxmrs · Pull Request #754 · m2lines/Samudra

alxmrs · 2026-06-02T23:57:29Z

Fixes #508. This makes the three changes in this issue by correcting the underlying data (adding essential coordinate information) and propagating that data during eval time. Since this is a data fix along with an eval output fix (IMO, this was the simplest approach), I provide two data engineering corrections:

I updated the original data engineering script to no longer omit the metadata that we need for evaluation. This way, we won't make the same mistake for future datasets.
I've added a script to update the existing datasets in place (they should be safe since we only add new coordinate information; we don't have to re-write any of the existing chunks).

My plan is that after this PR is reviewed (and merged), I'll run the coordinate update script on the public FOMO data. For good measure, I've run the backup script on this data so we can inspect it: s3://emulators/am16581/data/2025-11/om4_onedeg_v3/OM4.zarr.

alxmrs · 2026-06-04T00:21:52Z

@codex may I have your review?

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0cc2447628

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

alxmrs · 2026-06-04T00:53:56Z

Thanks, that was a genuine bug. I pushed a fix. @codex, may I have another round of review?

chatgpt-codex-connector · 2026-06-04T00:56:32Z

Codex Review: Didn't find any major issues. Chef's kiss.

jder

Would be great to ask @adam-subel to look at some outputs here since this is a lot of new code and I'm finding it a bit hard to follow exactly how data is flowing & being transformed now.

I would rather re-run the full pipeline than use or keep the backfill script and its tests, to avoid any drift. What datasets would you want to backfill? Can we reprocess them instead?

More fuzzily, I am somewhat worried about the proliferation of names, magic numbers, and name patterns like foo_i in our code and tests especially given we are starting to use other datasets which presumably will not match. WDYT?

jder · 2026-06-15T19:59:12Z

+DZ = np.array(
+    [
+        5,
+        10,
+        15,
+        20,
+        30,
+        50,
+        70,
+        100,
+        150,
+        200,
+        250,
+        300,
+        400,
+        500,
+        600,
+        800,
+        1000,
+        1000,
+        1000,
+    ],
+    dtype="float64",
+)
+LEV = np.array(
+    [
+        2.5,
+        10,
+        22.5,
+        40,
+        65,
+        105,
+        165,
+        250,
+        375,
+        550,
+        775,
+        1050,
+        1400,
+        1850,
+        2400,
+        3100,
+        4000,
+        5000,
+        6000,
+    ],
+    dtype="float64",


Is there somewhere we could read these from?
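Whatever the source of truth ends up being, one of the two arrays is redundant: the `LEV` values in this hunk are exactly the midpoints of the layers defined by `DZ`. A minimal sketch, assuming depth is measured downward from 0 at the surface; `layer_midpoints` is a hypothetical helper, not something in the PR:

```python
import numpy as np

# Layer thicknesses, copied from the DZ array in the hunk above.
DZ = np.array(
    [5, 10, 15, 20, 30, 50, 70, 100, 150, 200,
     250, 300, 400, 500, 600, 800, 1000, 1000, 1000],
    dtype="float64",
)


def layer_midpoints(dz):
    """Return the mid-depth of each layer (hypothetical helper).

    Assumes the first layer starts at depth 0 and layers are contiguous.
    """
    bottoms = np.cumsum(dz)      # depth of each layer's lower interface
    return bottoms - dz / 2.0    # step back half a layer to reach the centre


LEV = layer_midpoints(DZ)
```

Deriving one array from the other leaves a single list of numbers to keep in sync with the dataset, and the same function could be applied to thicknesses read from a file instead of a literal.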

jder · 2026-06-15T20:05:42Z

+        base, _, idx = var_str.rpartition("_")
+        if base and idx.isdigit() and int(idx) < n_depths:


Instead of testing this heuristically, can we read the known set of variable names off the DatasetSpec?
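A minimal sketch of the difference between the two approaches. What `DatasetSpec` actually exposes is not shown in this thread, so the list of depth-resolved base names below is a stand-in, and `thetao`, `so`, and `wind_10` are hypothetical variable names chosen only to illustrate the failure mode:

```python
def split_by_suffix(var_str, n_depths):
    """Heuristic from the hunk above: a trailing _<int> is a depth index."""
    base, _, idx = var_str.rpartition("_")
    if base and idx.isdigit() and int(idx) < n_depths:
        return base, int(idx)
    return var_str, None


def build_lookup(base_names, n_depths):
    """Enumerate every known '<base>_<i>' name up front (hypothetical helper)."""
    return {
        f"{base}_{i}": (base, i)
        for base in base_names
        for i in range(n_depths)
    }


def split_by_lookup(var_str, lookup):
    """Only names that were explicitly declared are split."""
    return lookup.get(var_str, (var_str, None))


n_depths = 19
lookup = build_lookup(["thetao", "so"], n_depths)  # stand-in for the spec

# A surface variable whose name happens to end in a small integer.
heuristic_result = split_by_suffix("wind_10", n_depths)
lookup_result = split_by_lookup("wind_10", lookup)
```

Here the heuristic splits `wind_10` into a base name and a depth index, while the lookup leaves it alone because it was never declared as depth-resolved.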

jder · 2026-06-15T20:10:36Z

        self.time_buffer = None
+
+    def _output_coords(self) -> dict:
+        """Build CF-aligned output coordinates matching the ground-truth layout.


What is "CF-aligned"? What is "ground-truth layout"?

alxmrs · 2026-06-16

CF stands for the "Climate and Forecast" metadata conventions. Xarray decodes them by default, but they are only relevant to some Xarray-openable datasets, like ours.

Let me get back to you on the other question with a closer look.

https://cfconventions.org/Data/cf-conventions/cf-conventions-1.13/cf-conventions.html
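For a concrete sense of what the conventions ask of horizontal coordinates, here is a minimal sketch that checks for the `standard_name` and `units` attributes CF defines for latitude and longitude. The coordinate names `lat` and `lon` follow the hunk in this PR; `missing_cf_attrs` is a hypothetical helper, and whether this is what "CF-aligned" means in the docstring is for the author to confirm:

```python
# Attribute values taken from the CF conventions. CF also accepts other
# spellings of the units (for example "degree_north"); this sketch checks
# only one spelling to stay short.
CF_COORD_ATTRS = {
    "lat": {"standard_name": "latitude", "units": "degrees_north"},
    "lon": {"standard_name": "longitude", "units": "degrees_east"},
}


def missing_cf_attrs(coord_attrs):
    """Return {coord_name: [attribute names that are absent or different]}."""
    problems = {}
    for name, expected in CF_COORD_ATTRS.items():
        actual = coord_attrs.get(name, {})
        bad = [key for key, value in expected.items() if actual.get(key) != value]
        if bad:
            problems[name] = bad
    return problems


# Example: lat is fully described, lon lacks its standard_name.
report = missing_cf_attrs(
    {
        "lat": {"standard_name": "latitude", "units": "degrees_north"},
        "lon": {"units": "degrees_east"},
    }
)
```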

jder · 2026-06-15T20:11:39Z

+            "lat": (("y", "x"), np.broadcast_to(y_vals[:, None], (ny, nx)).copy()),
+            "lon": (("y", "x"), np.broadcast_to(x_vals[None, :], (ny, nx)).copy()),


How about keeping the old lat/lon around as lat_2d lon_2d or something so that we don't need to recompute them here and potentially mess something up?
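For reference, a small sketch of what the two lines in this hunk compute, using made-up 1D values. `np.broadcast_to` returns a read-only view, which is presumably why the hunk calls `.copy()`. The result matches `np.meshgrid`, so it reproduces the original 2D fields only if the source grid is rectilinear; on a curvilinear grid the 2D `lat`/`lon` cannot be rebuilt from 1D values, which is one argument for carrying the originals through instead:

```python
import numpy as np

ny, nx = 3, 4
y_vals = np.linspace(-60.0, 60.0, ny)  # hypothetical 1D latitudes
x_vals = np.linspace(0.0, 270.0, nx)   # hypothetical 1D longitudes

# Same expressions as in the hunk: stretch each 1D axis across the other.
lat_view = np.broadcast_to(y_vals[:, None], (ny, nx))  # read-only view
lat = lat_view.copy()                                  # writable array
lon = np.broadcast_to(x_vals[None, :], (ny, nx)).copy()

# Equivalent construction with meshgrid (default "xy" indexing).
lon_mesh, lat_mesh = np.meshgrid(x_vals, y_vals)
same_as_meshgrid = np.array_equal(lat, lat_mesh) and np.array_equal(lon, lon_mesh)
```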

jder · 2026-06-15T20:18:38Z

+
+
+def _source_coords(ny, nx):
+    """Coords as `get_coords_dict` returns them after backfill: 1D lat/lon dims,


What does this have to do with backfill?

alxmrs changed the title ~~Experiment: how do we conserve global ocean area?~~ Updating the format of our rollout (eval.py) to make it easier to calculate metrics (true to the original data). Jun 3, 2026

alxmrs commented Jun 3, 2026

Comment thread data/ocean_preprocessing/areacello_xesmf_check.py Outdated

chatgpt-codex-connector Bot reviewed Jun 4, 2026

Comment thread src/ocean_emulators/utils/writer.py

alxmrs added 9 commits June 3, 2026 17:54

c45be71 · Experiment: how do we conserve global ocean area?
c98bef8 · Fix the data gen pipeline so it doesn't drop these essential variables.
f592e04 · Updated eval writer to address all three of the outstanding usability issues, provided the data is properly structured.
74b8764 · Script to backfill grid coordinates to existing datasets.
54d1c8f · Added test for proof of correctness.
458c597 · Added a rationale for the areacello grid decision (use the ocean fraction as well).
7665e46 · Omitting script that answered a question. Thank you for your service.
4f8799d · Codex caught a subtle bug: what are the correct number of levels?
c495447 · Helper to support remote zarr stores.

alxmrs force-pushed the u/alxmrs/eval-usability branch from 1c8b5fa to c495447 Compare June 4, 2026 00:54

alxmrs requested review from adam-subel and jder June 4, 2026 01:23

alxmrs marked this pull request as ready for review June 4, 2026 01:23

jder requested changes Jun 15, 2026

alxmrs wants to merge 9 commits into main from u/alxmrs/eval-usability
