I'm trying to calculate a monthly climatology for a subset of the time dimension in an xarray dataset. Time is defined using datetime64.
This works fine if I want to use the whole timeseries:
monthly_avr=ds_clm.groupby('time.month').mean(dim='time')
But I really only want the years from 2001 onward. Neither of these works:
monthly_avr2=ds_clm.where(ds_clm.time>'2001-01-01').groupby('time.month').mean('time')
monthly_avr3=ds_clm.isel(time=slice('2001-01-01', '2018-01-01')).groupby('time.month').mean('time')
Here is what my dataset looks like:
<xarray.Dataset>
Dimensions:      (hist_interval: 2, lat: 192, lon: 288, time: 1980)
Coordinates:
  * lon          (lon) float32 0.0 1.25 2.5 3.75 5.0 6.25 7.5 8.75 10.0 ...
  * lat          (lat) float32 -90.0 -89.057594 -88.11518 -87.172775 ...
  * time         (time) datetime64[ns] 1850-01-31 1850-02-28 1850-03-31 ...
Dimensions without coordinates: hist_interval
Data variables:
    EFLX_LH_TOT  (time, lat, lon) float32 0.26219246 0.26219246 0.26219246 ...
Does anyone know the correct syntax for subsetting in time using datetime64?
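Indexing and selecting data in xarray by coordinate value is typically done using the sel() method. isel() indexes by integer position, so it cannot interpret date strings. In your case, something like the following example should work (note that label-based slices in sel() include both endpoints):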
monthly_avr3 = ds_clm.sel(
    time=slice('2001-01-01', '2018-01-01')).groupby('time.month').mean('time')
Using the where() method can also be useful sometimes, but by default it masks non-matching values with NaN instead of removing them, so for your use case you would also need to add the drop=True option:
monthly_avr2 = ds_clm.where(
    ds_clm['time.year'] > 2000, drop=True).groupby('time.month').mean('time')
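If you want to check that the two approaches agree, here is a minimal sketch on a small synthetic dataset. The dataset `ds`, its grid, its date range, and its values are made up for illustration (they are not your `ds_clm`), and it uses month-start timestamps and the `ds.time.dt.year` accessor in place of `ds_clm['time.year']`.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for ds_clm: 10 years of monthly data on a 2x3 grid.
time = pd.date_range("1995-01-01", periods=120, freq="MS")
lat = np.array([-45.0, 45.0])
lon = np.array([0.0, 120.0, 240.0])

# Each value equals its calendar year, so the expected means are easy to verify.
years = time.year.values.astype("float64")
data = np.broadcast_to(years[:, None, None], (time.size, lat.size, lon.size)).copy()

ds = xr.Dataset(
    {"EFLX_LH_TOT": (("time", "lat", "lon"), data)},
    coords={"time": time, "lat": lat, "lon": lon},
)

# Approach 1: label-based selection with sel(); both slice endpoints are inclusive.
subset = ds.sel(time=slice("2001-01-01", "2004-12-31"))
clim_sel = subset.groupby("time.month").mean("time")

# Approach 2: boolean mask with where(); drop=True removes the masked time steps.
clim_where = ds.where(ds.time.dt.year > 2000, drop=True).groupby("time.month").mean("time")
```

Both results should have a `month` dimension of length 12, and with this synthetic data every value is the mean of the years 2001 through 2004.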