I want to do a rolling computation on data that contains missing values.
Sample code (for simplicity I'm using a rolling sum as the example, but I want something more generic):
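import numpy as np
import pandas

# Sum only the non-missing values in the window
foo = lambda z: z[pandas.notnull(z)].sum()
x = np.arange(10, dtype="float")
x[6] = np.nan  # introduce a missing value
x2 = pandas.Series(x)
pandas.rolling_apply(x2, 3, foo)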
which produces:
0 NaN
1 NaN
2 3
3 6
4 9
5 12
6 NaN
7 NaN
8 NaN
9 24
I think that during the "rolling", any window that contains missing data is skipped in the computation, so foo is never called for it. I'm looking to get a result along the lines of:
0 NaN
1 NaN
2 3
3 6
4 9
5 12
6 9
7 12
8 15
9 24
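One way to check that hypothesis is to count the non-missing observations in each window. The snippet below is only a sketch: it uses the method-style `Series.rolling` API from newer pandas releases rather than the `pandas.rolling_apply` call above, and the name `valid` is made up for illustration.

```python
import numpy as np
import pandas as pd

x = np.arange(10, dtype="float")
x[6] = np.nan
x2 = pd.Series(x)

# Count the non-missing observations inside each 3-wide window.
# min_periods=1 so the count itself is reported for every position.
valid = x2.notna().astype(float).rolling(window=3, min_periods=1).sum()
print(valid)
```

The windows ending at positions 6, 7 and 8 each hold only two valid values, which is fewer than the window size of 3.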
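Pass min_periods=2 so that a window is evaluated as long as it holds at least two non-missing values (note that index 1 is then computed as well):
In [7]: pandas.rolling_apply(x2, 3, foo, min_periods=2)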
Out[7]:
0 NaN
1 1
2 3
3 6
4 9
5 12
6 9
7 12
8 15
9 24
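`pandas.rolling_apply` was removed in later pandas releases, so here is a rough equivalent using `Series.rolling(...).apply(...)`. Treat it as a sketch under that assumption rather than the original code: `raw=True` is chosen here so that `foo` receives a plain NumPy array, and the names `result` and `exact` are illustrative. The last step blanks out the first two positions, in case the output should match the one requested in the question exactly (NaN at index 1).

```python
import numpy as np
import pandas as pd

def foo(z):
    # z is one window as a NumPy array; drop missing values, then sum
    z = np.asarray(z, dtype=float)
    return z[~np.isnan(z)].sum()

x = np.arange(10, dtype="float")
x[6] = np.nan
x2 = pd.Series(x)

window = 3

# Evaluate foo on every window with at least 2 non-missing values
result = x2.rolling(window=window, min_periods=2).apply(foo, raw=True)

# Optional: blank out positions that do not yet have a full-length window
exact = result.copy()
exact.iloc[: window - 1] = np.nan

print(result)
print(exact)
```

The same pattern should work for other reductions: swap the body of `foo` for whatever per-window computation is needed, and pick `min_periods` according to how many valid values that computation requires.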