When dealing with large sets of geometries, it would be nice if we could partition (chunk) the geometry coordinate and the GeometryIndex based on spatial locality (which requires spatial sorting or shuffling), as explained for geo-dataframes in dask-geopandas' spatial partitioning user guide.
This would require a good amount of work both here and upstream, though:
- refactor `xarray.IndexVariable` to allow dask arrays or other lazy arrays (More flexible index variables pydata/xarray#8124), although we can work around this by using regular `xarray.Variable` objects for now
- add some sort of generic, out-of-core index in Xarray (Low memory/out-of-core index? pydata/xarray#1650), or add our own ad-hoc version here
- compute space-filling curve values from geometries (copied or factored out from dask-geopandas?)
- implement spatial shuffling, probably reusing what is being added in Xarray (Add `GroupBy.shuffle_to_chunks()` pydata/xarray#9320)
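
To make the last two items more concrete, here is a rough pure-Python sketch of the idea, not a proposed implementation. It assumes each geometry is reduced to a representative point (e.g. its centroid) and uses a Morton (Z-order) curve for brevity; another space-filling curve such as Hilbert could be swapped in. The names `morton_codes` and `spatial_partitions` are made up for illustration and do not exist in this package or upstream.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # stand-in for a geometry centroid (assumption)


def _interleave_bits(v: int, bits: int) -> int:
    """Spread the lowest `bits` bits of v, leaving a zero bit between each."""
    out = 0
    for i in range(bits):
        out |= ((v >> i) & 1) << (2 * i)
    return out


def morton_codes(points: Sequence[Point], bits: int = 16) -> List[int]:
    """Return the Z-order code of each point, quantized over the total bounds."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, ymin = min(xs), min(ys)
    # Fall back to a unit span when all points share a coordinate.
    xspan = (max(xs) - xmin) or 1.0
    yspan = (max(ys) - ymin) or 1.0
    scale = (1 << bits) - 1
    codes = []
    for x, y in points:
        ix = int((x - xmin) / xspan * scale)
        iy = int((y - ymin) / yspan * scale)
        # x bits go to even positions, y bits to odd positions.
        codes.append(_interleave_bits(ix, bits) | (_interleave_bits(iy, bits) << 1))
    return codes


def spatial_partitions(
    points: Sequence[Point], chunk_size: int, bits: int = 16
) -> List[List[int]]:
    """Sort point positions by curve value and split them into chunks."""
    codes = morton_codes(points, bits)
    order = sorted(range(len(points)), key=codes.__getitem__)
    return [order[i:i + chunk_size] for i in range(0, len(order), chunk_size)]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.1), (9.9, 9.9)]
    # Nearby points end up in the same chunk.
    print(spatial_partitions(pts, chunk_size=2))
```

The positions returned for each chunk could then be used to reorder the geometry coordinate before chunking it. With lazy arrays the sort itself would need to become an out-of-core shuffle, which is where the upstream work listed above comes in.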