Add COSMO example notebook for onboarding (#587) #392

Open
info-gallary wants to merge 7 commits into mllam:main from info-gallary:docs-helloworld-notebook

info-gallary commented Mar 13, 2026

Describe your changes

This PR adds a new onboarding example notebook docs/notebooks/COSMO_example.ipynb that demonstrates the full Neural-LAM workflow (Datastore → Graph → Training → Evaluation → Visualization) using a COSMO-structured setup.

Key Improvements:

  • Onboarding Focused: Designed as a "getting started" guide that provides a complete overview of the library's capabilities.
  • CPU-Friendly: Uses synthetic, COSMO-structured Zarr data and reduced model dimensions, allowing the entire pipeline to be verified on standard hardware without a GPU or massive dataset downloads.
  • Repo Alignment: Moves documentation notebooks into the docs/notebooks/ directory to follow the project's organizational patterns.
  • Deduplication: Removes the generic HelloWorld.ipynb to avoid overlap with existing DANRA notebook work.
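
As a rough illustration of the CPU-friendly point above, a tiny synthetic dataset could be built along these lines. This is only a sketch and not the notebook's actual code: the dimension names (`time`, `y`, `x`), the variable names (`T_2M`, `PS`), the grid size, and the helper `make_synthetic_dataset` are all assumptions chosen for illustration.

```python
import numpy as np
import xarray as xr


def make_synthetic_dataset(n_time=8, ny=16, nx=16, seed=0):
    """Build a tiny random dataset with (time, y, x) dimensions.

    Hypothetical helper: names and shapes are illustrative assumptions,
    not the layout used by the notebook in this PR.
    """
    rng = np.random.default_rng(seed)
    # Hourly time stamps starting at an arbitrary date
    time = (
        np.datetime64("2020-01-01T00")
        + np.arange(n_time) * np.timedelta64(1, "h")
    ).astype("datetime64[ns]")
    shape = (n_time, ny, nx)
    dims = ("time", "y", "x")
    return xr.Dataset(
        {
            # Random values with roughly plausible magnitudes, stored as float32
            "T_2M": (dims, rng.normal(280.0, 5.0, shape).astype("float32")),
            "PS": (dims, rng.normal(1.0e5, 500.0, shape).astype("float32")),
        },
        coords={"time": time},
    )


ds = make_synthetic_dataset()
print(ds)
# Writing to Zarr needs the optional zarr package; the path is a placeholder:
# ds.to_zarr("synthetic_sample.zarr", mode="w")
```

A dataset of this size occupies a few kilobytes, so each pipeline stage can be exercised without a GPU or a large download.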

Motivation:
The maintainers requested a COSMO-specific example to serve as the primary onboarding documentation, replacing the previous general "hello world" implementation.

Dependencies:

  • neural-lam
  • mllam-data-prep (demonstrated in config)
  • weather-model-graphs (demonstrated in graph construction)
  • Standard stack: numpy, xarray, pandas, torch, pyyaml

Issue Link

closes #587

Type of change

  • 🐛 Bug fix (non-breaking change that fixes an issue)
  • ✨ New feature (non-breaking change that adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • 📖 Documentation (Addition or improvements to documentation)

Checklist before requesting a review

  • My branch is up-to-date with the target branch - if not update your fork with the changes from the target branch (use pull with --rebase option if possible).
  • I have performed a self-review of my code
  • For any new/modified functions/classes I have added docstrings that clearly describe its purpose, expected inputs and returned values
  • I have placed in-line comments to clarify the intent of any hard-to-understand passages of my code
  • I have updated the README to cover introduced code changes (No README changes required for this doc update)
  • I have added tests that prove my fix is effective or that my feature works (The notebook serves as an integration test)
  • I have given the PR a name that clearly describes the change, written in imperative form.
  • I have requested a reviewer and an assignee (assignee is responsible for merging). This applies only if you have write access to the repo, otherwise feel free to tag a maintainer to add a reviewer and assignee.

Checklist for reviewers

  • the code is readable
  • the code is well tested
  • the code is documented (including return types and parameters)
  • the code is easy to maintain

Author checklist after completed review

  • I have added a line to the CHANGELOG describing this change, in a section reflecting type of change:
    • added: COSMO_example.ipynb notebook to documentation for onboarding

Checklist for assignee

  • PR is up to date with the base branch
  • the tests pass
  • (if the PR is not just maintenance/bugfix) the PR is assigned to the next milestone. If it is not, propose it for a future milestone.
  • author has added an entry to the changelog (and designated the change as added, changed, fixed or maintenance)

@info-gallary info-gallary force-pushed the docs-helloworld-notebook branch from e27e1fe to 83b9421 Compare March 13, 2026 02:55
@info-gallary

Author

Hi @sadamov, I’ve updated this PR to a COSMO example notebook as discussed. The notebook and changelog have been revised accordingly. Please let me know if any further changes are needed.

@sadamov sadamov self-requested a review March 13, 2026 09:43
@sadamov sadamov self-assigned this Mar 15, 2026
@sadamov sadamov added the documentation Improvements or additions to documentation label Mar 15, 2026

@sadamov sadamov left a comment

Collaborator

I think there was a misunderstanding here: the COSMO notebook has already been created. There are, however, some outstanding issues that need to be addressed, as mentioned in:

#69 (comment)

@info-gallary great, you will face one main obstacle. Currently the COSMO data is hosted on the ETH Research Collection and is several hundred GB in size, so for a hello world example we will need to host a tiny piece of that data on a more accessible storage option, similar to DANRA. You might want to convert this .md file to a Jupyter notebook. Here is my .md file for COSMO: https://github.com/joeloskarsson/neural-lam-dev/blob/research/docs/reproduce_paper_sample.md

Also, #69 is mostly about DANRA, so you shouldn't close that issue with this PR.

So to rephrase, the issue with the COSMO hello world example is how to bring the notebook from the research branch into main and get it to run in a reasonable amount of time. Preparing and hosting a reduced COSMO sample is very much the core work. I do understand that hosting data is not simple or cheap, but it is required here. Since this PR does not solve these issues, I'll mark it as a draft for now.
If you would like to start working on the example data hosting, I suggest contacting @leifdenby, as he has some options from what I understand 😉
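
To make the "reduced sample" idea concrete, here is a minimal sketch of cutting a large gridded dataset down in time and space with xarray. It is not the procedure agreed on in this thread: the dimension names, the `T_2M` variable, the stride, and the helper `reduce_sample` are assumptions, and a zero-filled stand-in replaces the real COSMO archive.

```python
import numpy as np
import xarray as xr


def reduce_sample(ds, n_time, stride):
    """Keep the first n_time steps and every stride-th grid point.

    Hypothetical helper; assumes dimensions named time, y and x.
    """
    return ds.isel(
        time=slice(0, n_time),
        y=slice(None, None, stride),
        x=slice(None, None, stride),
    )


# Stand-in for the full dataset. With real data this would be something like
# xr.open_zarr("<path-to-full-dataset>"), where the path is a placeholder.
full = xr.Dataset(
    {"T_2M": (("time", "y", "x"), np.zeros((48, 40, 60), dtype="float32"))}
)

sample = reduce_sample(full, n_time=12, stride=4)
print(dict(sample.sizes))
print(f"{sample.nbytes / full.nbytes:.3%} of the original size")
```

Spatial striding changes the grid resolution, so cropping a contiguous sub-region may be the better choice if the example is meant to reproduce model behaviour. Either way, the size ratio gives a quick check on whether the sample fits a given hosting budget.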

@sadamov sadamov marked this pull request as draft March 16, 2026 05:26
@info-gallary

Author

Thanks for the clarification, that makes sense. I understand now that the core missing work is preparing and hosting a reduced COSMO sample so the notebook can actually run in a reasonable time, rather than only converting the existing notebook into main. I’ll avoid closing #69 from this PR.

I’d still be interested in helping with the reduced sample data preparation/hosting side. I’ll reach out to @leifdenby to better understand the available hosting options and what would be most useful here.

@sadamov sadamov added this to the v0.7.0 (proposed) milestone Apr 9, 2026
@sadamov sadamov self-requested a review April 9, 2026 09:09
@sadamov sadamov linked an issue Apr 10, 2026 that may be closed by this pull request
@sadamov sadamov marked this pull request as ready for review April 10, 2026 09:22
@sadamov

sadamov commented Apr 10, 2026

Collaborator

Okay I organized the hello_world issue and PRs:

If I overlooked anything, let me know. I tried my best to look through all previous communication.

@sadamov sadamov changed the title from "Docs: add COSMO example notebook for onboarding" to "Add COSMO example notebook for onboarding (#587)" Apr 13, 2026
@sadamov sadamov modified the milestones: v0.7.0, v0.8.0 May 11, 2026