
Add documentation/examples for new data loaders and help with use case #52

Open
@djhoese

Description

@jhamman just presented on some updates to xbatcher, including the new data loader interfaces from #25. I tried to find a documented way of using them and I don't see one. If some documentation could be added, that would be great: I've been helping people at my work use Satpy to prepare data for their machine learning projects, and I think the data loader could be a nice optimization. Their preparation work has always ended with saving to NetCDF or zarr. My understanding of these interfaces in xbatcher is that the saving-to-disk step shouldn't be needed (except for future caching functionality). Is that correct?

The pseudo-code of the most recent project I helped with looks something like this:

import satpy

dates_of_interest = [...]
geographic_bounds_of_interest = [...]
channels_of_interest = [...]

for dt in dates_of_interest:
    # Project-specific helper that collects the ABI L1b files for this time step
    abi_filenames = get_goes16_abi_filenames(dt)
    scn = satpy.Scene(reader='abi_l1b', filenames=abi_filenames)
    scn.load(channels_of_interest)

    for bbox in geographic_bounds_of_interest:
        # Cut out each geographic region and save it to its own NetCDF file
        cropped_scn = scn.crop(xy_bbox=bbox)
        cropped_scn.save_datasets(filename="some_bbox_specific_file.nc")

And then they do their ML work based on those NetCDF files. Satpy is all xarray[dask]-based, and the actual code for the above does a lot of client.map work (with distributed's Client) to process the individual pieces. I can't speak for the researcher I'm helping, but if there's an optimization here, using a data loader to hand these "patches" (their term) to pytorch/tensorflow without needing to save to NetCDF first, that would be a really good example for a certain NASA project we're a part of. A rough sketch of what I'm imagining is below.
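
For reference, here's a minimal, untested sketch of what I mean, using xbatcher's BatchGenerator plus the torch MapDataset loader from #25. The 256x256 patch size, the single-channel selection, and the reuse of the same generator for both inputs and targets are placeholders for illustration, not real project choices:

import satpy
import xbatcher
from torch.utils.data import DataLoader
from xbatcher.loaders.torch import MapDataset

scn = satpy.Scene(reader='abi_l1b', filenames=abi_filenames)
scn.load(channels_of_interest)
cropped_scn = scn.crop(xy_bbox=bbox)

# Pull one channel out as an xarray DataArray; handling multiple
# channels at once is left out of this sketch.
data_arr = cropped_scn[channels_of_interest[0]]

# Lazily slice the cropped scene into fixed-size patches (256x256 is
# an arbitrary example size) instead of writing them to NetCDF.
patch_gen = xbatcher.BatchGenerator(data_arr, input_dims={'x': 256, 'y': 256})

# Hand the patches to PyTorch. Passing the same generator for inputs
# and targets is only a placeholder; real targets would come from elsewhere.
torch_ds = MapDataset(patch_gen, patch_gen)
loader = DataLoader(torch_ds, batch_size=4)

If something like that works end-to-end, skipping the intermediate files entirely, that's exactly the kind of example I'd love to see in the docs.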
