Hi @atrabattoni,
thanks for the great work on xdas! While using it for SeisBench, I've found a few things that might make the reading functions more comfortable for users, so I wanted to start a discussion.
- I was wondering whether it would be possible to automatically infer the engine for `open_dataarray`. For users who know their data well, specifying the correct engine is no problem, but for users who got their data second-hand (like me), it's always a bit of a struggle to figure out. Looking at the different data types, it's probably fairly easy to develop a routine that identifies the engine based on the available attributes?
- Is there a reason why `open_dataarray` and `open_mfdataarray` are two separate functions? From a user perspective, I have the same intent when calling either of them; only the backend differs. Maybe they could be unified into a single `open` function. The obspy `read` function works this way.
- It might be useful to allow the `open` function to work as a context manager. That would handle the file pointers more smoothly and fit closely with the way the underlying h5py library works. The lifecycle of the data array would then be clear. This could be complemented with an `xdas.read` function that does the same as `open`, except that it reads the data directly into memory and is therefore not a context manager. The code could look something like this:

    # Option with open
    with xdas.open("mydasdata.nc") as da:
        # da is a virtual array
        picks = model.classify(da)  # Some processing call

    # Option with read
    da = xdas.read("mydasdata.nc")  # da is directly read into memory, no context manager required

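
To make the first two points a bit more concrete, here is a rough sketch of how engine inference and the single/multi-file dispatch might work. Everything in it is an assumption for illustration: `ENGINE_SIGNATURES`, `infer_engine`, and `choose_backend` are hypothetical names that do not exist in xdas, and the engine names and signature keys are placeholders, not the real file layouts. In practice the keys could come from something like the top-level group names of the HDF5 file.

```python
import os
from typing import Iterable

# HYPOTHETICAL table: engine name -> group/attribute names that would identify it.
# These are placeholders; the real signatures would have to be taken from each format.
ENGINE_SIGNATURES = {
    "engine_a": {"header", "data"},
    "engine_b": {"Acquisition"},
}


def infer_engine(keys: Iterable[str]) -> str:
    """Return the one engine whose signature keys are all present in the file."""
    present = set(keys)
    matches = [name for name, sig in ENGINE_SIGNATURES.items() if sig <= present]
    if len(matches) != 1:
        # Refuse to guess: the user can still pass the engine explicitly.
        raise ValueError(f"cannot infer engine unambiguously, candidates: {matches}")
    return matches[0]


def choose_backend(paths) -> str:
    """Decide whether a unified open() would use the single- or multi-file path."""
    if isinstance(paths, (str, os.PathLike)):
        path = os.fspath(paths)
        # A wildcard pattern refers to several files, a plain path to one.
        return "multi" if any(ch in path for ch in "*?[") else "single"
    # Any other input is treated as a collection of paths.
    return "multi"
```

Raising an error on ambiguous or unknown files, instead of guessing, would keep the explicit `engine` argument as a fallback, so nothing would change for users who already know their format.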
As I said in the beginning, these are just suggestions that I think might make file reading smoother. I totally understand that there might be reasons against them (or that you might not have the capacity to implement them).