I am trying to set center=True in the pandas rolling function for a time series:
import pandas as pd
series = pd.Series(1, index = pd.date_range('2014-01-01', '2014-04-01', freq = 'D'))
series.rolling('7D', min_periods=1, center=True, closed='left')
But the output is:
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
<ipython-input-6-6b30c16a2d12> in <module>()
1 import pandas as pd
2 series = pd.Series(1, index = pd.date_range('2014-01-01', '2014-04-01', freq = 'D'))
----> 3 series.rolling('7D', min_periods=1, center=True, closed='left')
~\Anaconda3\lib\site-packages\pandas\core\generic.py in rolling(self, window, min_periods, freq, center, win_type, on, axis, closed)
6193 min_periods=min_periods, freq=freq,
6194 center=center, win_type=win_type,
-> 6195 on=on, axis=axis, closed=closed)
6196
6197 cls.rolling = rolling
~\Anaconda3\lib\site-packages\pandas\core\window.py in rolling(obj, win_type, **kwds)
2050 return Window(obj, win_type=win_type, **kwds)
2051
-> 2052 return Rolling(obj, **kwds)
2053
2054
~\Anaconda3\lib\site-packages\pandas\core\window.py in __init__(self, obj, window, min_periods, freq, center, win_type, axis, on, closed, **kwargs)
84 self.win_freq = None
85 self.axis = obj._get_axis_number(axis) if axis is not None else None
---> 86 self.validate()
87
88 @property
~\Anaconda3\lib\site-packages\pandas\core\window.py in validate(self)
1090 # we don't allow center
1091 if self.center:
-> 1092 raise NotImplementedError("center is not implemented "
1093 "for datetimelike and offset "
1094 "based windows")
NotImplementedError: center is not implemented for datetimelike and offset based windows
Expected output is the one generated by:
import pandas as pd
series = pd.Series(1, index = pd.date_range('2014-01-01', '2014-04-01', freq = 'D'))
series.rolling(7, min_periods=1, center=True).sum().head(10)
2014-01-01 4.0
2014-01-02 5.0
2014-01-03 6.0
2014-01-04 7.0
2014-01-05 7.0
2014-01-06 7.0
2014-01-07 7.0
2014-01-08 7.0
2014-01-09 7.0
2014-01-10 7.0
Freq: D, dtype: float64
However, I would like to get it using datetime-like offsets, since that simplifies other parts of my code (not posted here).
Is there any alternative solution?
Thanks
Try the following (tested with pandas==0.23.3):
series.rolling('7D', min_periods=1, closed='left').sum().shift(-84, freq='h')
This centers your rolling sum within the 7-day window by shifting the labels back by 3.5 days, and still lets you use a 'datetimelike' string to define the window size. Note that shift() only takes an integer number of periods, so the 3.5-day offset is written as 84 hours.
This will produce your desired output:
series.rolling('7D', min_periods=1, closed='left').sum().shift(-84, freq='h')['2014-01-01':].head(10)
2014-01-01 12:00:00 4.0
2014-01-02 12:00:00 5.0
2014-01-03 12:00:00 6.0
2014-01-04 12:00:00 7.0
2014-01-05 12:00:00 7.0
2014-01-06 12:00:00 7.0
2014-01-07 12:00:00 7.0
2014-01-08 12:00:00 7.0
2014-01-09 12:00:00 7.0
2014-01-10 12:00:00 7.0
Freq: D, dtype: float64
Note that the rolling sum is assigned to the center of the 7-day windows (using midnight to midnight timestamps), so the centered timestamp includes '12:00:00'.
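If you would rather not hard-code the 84 hours, here is a rough sketch of the same idea that derives the half-window from the window string and moves the index directly. The variable names (window, half, trailing, centered, by_count, aligned) are just for illustration, and I have only reasoned through this on the daily series from your question, so treat it as a starting point rather than a tested recipe. It also compares the result against the integer-window version so you can see where the two agree.

```python
import pandas as pd

series = pd.Series(1, index=pd.date_range('2014-01-01', '2014-04-01', freq='D'))

window = '7D'
half = pd.Timedelta(window) / 2  # half of the window: 3 days 12 hours

# Trailing window [t - 7D, t), labelled at its right edge t
trailing = series.rolling(window, min_periods=1, closed='left').sum()

# Move each label back by half a window so it sits at the window centre
centered = trailing.copy()
centered.index = trailing.index - half
centered = centered.loc['2014-01-01':]

# Integer-window reference from the question
by_count = series.rolling(7, min_periods=1, center=True).sum()

# Drop the 12:00 part of the labels so the two results can be compared
aligned = centered.copy()
aligned.index = aligned.index.floor('D')
common = aligned.index.intersection(by_count.index)

print((aligned.loc[common] == by_count.loc[common]).all())

# Days present in the integer-window result but not in the shifted one
print(by_count.index.difference(aligned.index))
```

One thing to watch: the shifted result stops half a window before the end of the data, because a trailing window cannot look past the last timestamp. The integer version with center=True still returns values for those last few days, computed from shrinking windows.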
Another option (as you show at the end of your question) is to resample the data so that it has a regular datetime frequency, then use an integer window size (window=7) together with center=True. However, you state that other parts of your code benefit from defining window with a 'datetimelike' string, so this option may not be ideal.
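For completeness, a minimal sketch of that resample route, assuming a made-up irregular series (the dates below are hypothetical, not from your data) and assuming a sum is the aggregation you want:

```python
import pandas as pd

# Hypothetical irregular series: some days are missing
irregular = pd.Series(
    1,
    index=pd.to_datetime(['2014-01-01', '2014-01-02', '2014-01-04',
                          '2014-01-05', '2014-01-08']),
)

# Regularise to one row per day; days with no data sum to 0
daily = irregular.resample('D').sum()

# With evenly spaced rows, an integer window can be centred
centered = daily.rolling(7, min_periods=1, center=True).sum()
print(centered)
```

Filling the missing days with 0 is harmless for a sum but would distort a mean. In that case something like irregular.asfreq('D') leaves the gaps as NaN, which the rolling aggregation then skips.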