Description
What happened?
I'm not sure if this is a bug, or just surprising behavior.
When I have a dataset with time series variables and I do a groupby_bins operation followed by a mean() operation, the time series data is silently dropped from the dataset instead of being aggregated.
What did you expect to happen?
I expect the aggregation that follows groupby_bins to be applied to time series variables whenever it is well defined for them. For example, in the example code below, the mean() operation should have returned the average time in each bin.
Some aggregation operations might not be well defined for time (arguably sum(), for example). In such cases I'd expect it to return NaNs or raise an error.
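To illustrate the distinction, here is a minimal standard-library sketch (not xarray code). It shows that a mean of timestamps can be computed as a reference time plus the mean of the offsets, while adding two timestamps directly is rejected. The helper name mean_datetime and the sample values are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timedelta

# Two sample timestamps, ten seconds apart (illustrative values only).
times = [datetime(2024, 1, 1, 15, 0, 0), datetime(2024, 1, 1, 15, 0, 10)]


def mean_datetime(values):
    """Average datetimes by averaging their offsets from the first value."""
    reference = values[0]
    total_offset = sum((v - reference for v in values), timedelta(0))
    return reference + total_offset / len(values)


mean_time = mean_datetime(times)

# Adding two datetimes has no meaning, and Python raises TypeError for it.
try:
    times[0] + times[1]
    sum_is_defined = True
except TypeError:
    sum_is_defined = False

print(mean_time, sum_is_defined)
```

The same reasoning is why I would expect mean() to work on a datetime variable, while sum() could reasonably raise or return missing values.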
Minimal Complete Verifiable Example
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    {
        'measurement': ('trial', np.arange(0, 100, 10)),
        'time': ('trial', pd.date_range("20240101T1500", "20240101T1501", periods=10)),
    },
    coords={'trial': np.arange(10)},
)
ds_agged = ds.groupby_bins('trial', 5).mean()
# The 'time' variable is missing from the result, but 'measurement' is present
print(ds_agged)
MVCE confirmation
- Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
- Complete example — the example is self-contained, including all data and the text of any traceback.
- Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
- New issue — a search of GitHub Issues suggests this is not a duplicate.
- Recent environment — the issue occurs with the latest version of xarray and its dependencies.
Relevant log output
<xarray.Dataset> Size: 80B
Dimensions:      (trial_bins: 5)
Coordinates:
  * trial_bins   (trial_bins) object 40B (-0.009, 1.8] (1.8, 3.6] ... (7.2, 9.0]
Data variables:
    measurement  (trial_bins) float64 40B 5.0 25.0 45.0 65.0 85.0
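A possible workaround, sketched under my own assumptions rather than taken from the xarray documentation: convert the datetime variable to float seconds since a reference time, aggregate that numeric variable, then convert the bin means back to datetimes. The names t0, elapsed_s, and mean_offsets are hypothetical, and the sketch assumes float seconds give enough precision for the data.

```python
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    {
        "measurement": ("trial", np.arange(0, 100, 10)),
        "time": ("trial", pd.date_range("20240101T1500", "20240101T1501", periods=10)),
    },
    coords={"trial": np.arange(10)},
)

# Express each timestamp as float seconds since the earliest one.
t0 = ds["time"].values.min()
elapsed_s = (ds["time"].values - t0) / np.timedelta64(1, "s")

# Aggregate the numeric stand-in; numeric variables are kept by mean().
ds_numeric = ds.assign(time=("trial", elapsed_s))
agged = ds_numeric.groupby_bins("trial", 5).mean()

# Convert the mean offsets back into datetimes and put them in the result.
mean_offsets = pd.to_timedelta(agged["time"].values, unit="s")
mean_time = (pd.Timestamp(t0) + mean_offsets).to_numpy()
agged = agged.assign(time=("trial_bins", mean_time))

print(agged)
```

This yields one mean timestamp per bin alongside the measurement means, which is the result I expected from calling mean() directly.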
Anything else we need to know?
No response
Environment
INSTALLED VERSIONS
commit: None
python: 3.10.16 | packaged by conda-forge | (main, Dec 5 2024, 14:16:10) [GCC 13.3.0]
python-bits: 64
OS: Linux
OS-release: 6.8.0-52-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.14.3
libnetcdf: 4.9.2
xarray: 2025.3.1
pandas: 2.2.3
numpy: 2.1.3
scipy: 1.15.2
netCDF4: 1.7.2
pydap: 3.5.4
h5netcdf: 1.6.1
h5py: 3.13.0
zarr: 2.18.3
cftime: 1.6.4
nc_time_axis: 1.4.1
iris: 3.11.0
bottleneck: 1.4.2
dask: 2025.3.0
distributed: 2025.3.0
matplotlib: 3.10.1
cartopy: 0.24.0
seaborn: 0.13.2
numbagg: 0.9.0
fsspec: 2025.3.2
cupy: None
pint: 0.24.4
sparse: 0.16.0
flox: None
numpy_groupies: None
setuptools: 75.8.0
pip: 25.0
conda: None
pytest: None
mypy: None
IPython: 8.32.0
sphinx: None