Use center in pandas rolling when using a time-series

karen · Oct 30, 2017

I am trying to set center=True in the pandas rolling function for a time series:

import pandas as pd
series = pd.Series(1, index = pd.date_range('2014-01-01', '2014-04-01', freq = 'D'))
series.rolling('7D', min_periods=1, center=True, closed='left')

But the output is:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
<ipython-input-6-6b30c16a2d12> in <module>()
      1 import pandas as pd
      2 series = pd.Series(1, index = pd.date_range('2014-01-01', '2014-04-01', freq = 'D'))
----> 3 series.rolling('7D', min_periods=1, center=True, closed='left')

~\Anaconda3\lib\site-packages\pandas\core\generic.py in rolling(self, window, min_periods, freq, center, win_type, on, axis, closed)
   6193                                    min_periods=min_periods, freq=freq,
   6194                                    center=center, win_type=win_type,
-> 6195                                    on=on, axis=axis, closed=closed)
   6196 
   6197         cls.rolling = rolling

~\Anaconda3\lib\site-packages\pandas\core\window.py in rolling(obj, win_type, **kwds)
   2050         return Window(obj, win_type=win_type, **kwds)
   2051 
-> 2052     return Rolling(obj, **kwds)
   2053 
   2054 

~\Anaconda3\lib\site-packages\pandas\core\window.py in __init__(self, obj, window, min_periods, freq, center, win_type, axis, on, closed, **kwargs)
     84         self.win_freq = None
     85         self.axis = obj._get_axis_number(axis) if axis is not None else None
---> 86         self.validate()
     87 
     88     @property

~\Anaconda3\lib\site-packages\pandas\core\window.py in validate(self)
   1090             # we don't allow center
   1091             if self.center:
-> 1092                 raise NotImplementedError("center is not implemented "
   1093                                           "for datetimelike and offset "
   1094                                           "based windows")

NotImplementedError: center is not implemented for datetimelike and offset based windows

The expected output is the one generated by:

import pandas as pd
series = pd.Series(1, index = pd.date_range('2014-01-01', '2014-04-01', freq = 'D'))
series.rolling(7, min_periods=1, center=True).sum().head(10)

2014-01-01    4.0
2014-01-02    5.0
2014-01-03    6.0
2014-01-04    7.0
2014-01-05    7.0
2014-01-06    7.0
2014-01-07    7.0
2014-01-08    7.0
2014-01-09    7.0
2014-01-10    7.0
Freq: D, dtype: float64

But I would like to get this using datetime-like offsets, since that simplifies part of my other code (not posted here).

Is there any alternative solution?

Thanks

Answer

PJW · Dec 12, 2017

Try the following (tested with pandas==0.23.3):

series.rolling('7D', min_periods=1, closed='left').sum().shift(-84, freq='h')

This centers your rolling sum in the 7-day window by shifting the result back 3.5 days, while still letting you use a 'datetimelike' string to define the window size. Note that the periods argument of shift() must be an integer, so the 3.5-day shift is expressed as 84 hours.

This will produce your desired output:

series.rolling('7D', min_periods=1, closed='left').sum().shift(-84, freq='h')['2014-01-01':].head(10)

2014-01-01 12:00:00    4.0
2014-01-02 12:00:00    5.0
2014-01-03 12:00:00    6.0
2014-01-04 12:00:00    7.0
2014-01-05 12:00:00    7.0
2014-01-06 12:00:00    7.0
2014-01-07 12:00:00    7.0
2014-01-08 12:00:00    7.0
2014-01-09 12:00:00    7.0
2014-01-10 12:00:00    7.0
Freq: D, dtype: float64

Note that the rolling sum is assigned to the center of the 7-day windows (using midnight to midnight timestamps), so the centered timestamp includes '12:00:00'.
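
As a quick check of the shift approach, the sketch below rebuilds the series from the question and compares the shifted offset-based sum with the integer-window centered sum. The variable names are illustrative only, and the comparison is limited to the first ten rows, as in the outputs shown.

```python
import pandas as pd

# Daily series of ones, as in the question.
series = pd.Series(1, index=pd.date_range('2014-01-01', '2014-04-01', freq='D'))

# Offset-based trailing window, then move each label back by
# half the window (84 hours = 3.5 days).
shifted = series.rolling('7D', min_periods=1, closed='left').sum().shift(-84, freq='h')

# Integer-based centered window: the output the question asks for.
centered = series.rolling(7, min_periods=1, center=True).sum()

# Compare values only, because the shifted labels carry a 12:00:00 time.
first_shifted = shifted.loc['2014-01-01':].head(10).to_numpy()
first_centered = centered.head(10).to_numpy()
same_start = bool((first_shifted == first_centered).all())
print(same_start)
```

The two results only agree away from the end of the data: the shifted series has no labels for the last 3.5 days, so the shrinking trailing windows that center=True produces there are not reproduced.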

Another option (as you show at the end of your question) is to resample the data so that it has a regular datetime frequency, then use an integer window size (window = 7) with center=True. However, you state that other parts of your code benefit from defining the window with a 'datetimelike' string, so perhaps this option is not ideal.
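
A minimal sketch of that second option, assuming a hypothetical daily series with two missing days (the dates and the name irregular are made up for illustration). Here resample('D').sum() restores a regular daily index, filling the missing days with 0, after which an integer window with center=True can be used:

```python
import pandas as pd

# Hypothetical input: ten days of ones with two days missing.
full = pd.Series(1, index=pd.date_range('2014-01-01', '2014-01-10', freq='D'))
irregular = full.drop(pd.to_datetime(['2014-01-03', '2014-01-07']))

# Restore a regular daily index; days without data sum to 0.
daily = irregular.resample('D').sum()

# With one row per day, a 7-row window spans 7 days and can be centered.
centered = daily.rolling(7, min_periods=1, center=True).sum()
print(centered)
```

Whether 0 is the right fill for a missing day depends on what the data represents, so treat the sum() aggregation here as an assumption rather than a general recommendation.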