-
Notifications
You must be signed in to change notification settings - Fork 19
healpix moc index #151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
healpix moc index #151
Conversation
Nice! Not sure this PR fully closes #143, though, since this is Healpix specific.
I'll take a stab at adding coordinate transforms in a dggs-agnostic way in another PR (at least for lat/lon auxiliary coordinates), hopefully this will provide a good baseline for adding a lazy coordinate for cell ids here. |
(this doesn't work yet because `moc::RangeMOC` does not derive from `serde`'s `Serialize` and `Deserialize`)
@benbovy, this appears to work properly now. The current behavior is:
Missing functionality (compared to I'll look into adding a few more tests before this is ready for merging. |
In the case of HealpixMocIndex yes, but maybe not in all cases. This is the kind of things that we might want to clarify in Xarray's documentation or via blog posts using, e.g., some flowcharts. |
agreed, although I believe that in the end it should end up somewhere in the documentation (blog posts can easily become out-of-date). Either way, I think we have all the tests necessary, so this should be ready for review / merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @keewis, this will be a great addition!
I did a 1st pass and left a few comments. I'll do a more in-depth review next week.
"cdshealpix", | ||
"healpix-geo>=0.0.3", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possible to make it an optional dependency? I'm concerned by the number of dependencies (compiled packages) xdggs would require as we add more support for various DGGS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
true, I think we should figure out how to resolve this. When we added plotting support we decided to install everything by default (it was easier that way), but that won't scale (see also #128). However, I think we should limit the builtin DGGS to just h3 and healpix, and have separate extension packages for others.
healpix-geo
is a bit special in that you will need it to support ellipsoidal healpix, so that may just stay required (or in the same extra as cdshealpix
, should we modify the code to make h3ronpy
and cdshealpix
optional)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, I think we should limit the builtin DGGS to just h3 and healpix
Yes we can limit builtin support to just a few DGGSs (S2 might probably also make sense), but even so users who may be interested in only H3 would need to download a few Healpix dependencies, which is probably not ideal.
We can address that later, though.
I think I addressed the review comments (and I'll punt the decision on optional vs required dependencies to a different PR), so this should be ready for another round of reviews? |
Note: This is a highly experimental PR (and a lot still has to be done for this to be usable)
This tries to use the
RangeMOCIndex
class from EOPF-DGGS/healpix-geo#31 instead of the standard pandas index.In theory, this should allow opening and decoding datasets up to level 29 without having to keep the cell ids in memory (and quite possibly we can skip loading the cell ids if we see that we have$12 \cdot 4^{\mathrm{level}}$ cells).
The current state does not support
dask
(so the cell ids will still be kept in memory), but this is definitely something that will be added as the PR progresses – I still have to understand how the coordinate transform indexes work in xarray.changelog.md
api.rst
cc @benbovy