Pandas reindex dates in Groupby

clg4 · Aug 28, 2015

I have a DataFrame with sporadic dates as the index and two columns, 'id' and 'num'. I would like to group by the 'id' column and reindex each group so that its dates form a continuous daily range.

My sample dataset looks like this:

            id  num
2015-08-01  1   3
2015-08-05  1   5
2015-08-06  1   4
2015-07-31  2   1
2015-08-03  2   2
2015-08-06  2   3

My expected output after reindexing each group with forward fill (ffill) is:

            id  num
2015-08-01  1   3
2015-08-02  1   3
2015-08-03  1   3
2015-08-04  1   3
2015-08-05  1   5
2015-08-06  1   4
2015-07-31  2   1
2015-08-01  2   1
2015-08-02  2   1
2015-08-03  2   2
2015-08-04  2   2
2015-08-05  2   2
2015-08-06  2   3

I have tried this, among other things, to no avail:

newdf = df.groupby('id').reindex(method='ffill')

which raises the error:

AttributeError: Cannot access callable attribute 'reindex' of 'DataFrameGroupBy' objects, try using the 'apply' method

Any help would be much appreciated.

Answer

JoeCondron · Aug 28, 2015

There's probably a slicker way to do this, but this works:

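import pandas as pd

def reindex_by_date(df):
    # Build a daily range spanning this group's first and last dates
    dates = pd.date_range(df.index.min(), df.index.max())
    # Add the missing dates, then forward-fill the gaps they leave
    return df.reindex(dates).ffill()

# Apply per 'id' group, then drop the 'id' index level that groupby adds
df.groupby('id').apply(reindex_by_date).reset_index(0, drop=True)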
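
For anyone who wants to run this end to end, here is a minimal self-contained sketch built on the sample data from the question. It is a variation on the answer above, not the answer's exact code: it loops over the groups and joins them with pd.concat instead of calling groupby.apply, since how apply treats the grouping column may vary between pandas versions. The names fill_daily and result are made up for illustration, and the sketch assumes dates are unique within each id.

```python
import pandas as pd

# Sample data from the question: sporadic dates as the index
df = pd.DataFrame(
    {"id": [1, 1, 1, 2, 2, 2], "num": [3, 5, 4, 1, 2, 3]},
    index=pd.to_datetime(
        ["2015-08-01", "2015-08-05", "2015-08-06",
         "2015-07-31", "2015-08-03", "2015-08-06"]
    ),
)


def fill_daily(group):
    """Reindex one group to a daily range, forward-filling new rows."""
    # reindex(method="ffill") requires a sorted (monotonic) index
    group = group.sort_index()
    dates = pd.date_range(group.index.min(), group.index.max(), freq="D")
    return group.reindex(dates, method="ffill")


# Iterating over a groupby yields (key, sub-frame) pairs; each sub-frame
# still contains the 'id' column, so nothing has to be restored afterwards.
result = pd.concat([fill_daily(g) for _, g in df.groupby("id")])
print(result)
```

Each group is filled only between its own first and last date, so id 1 starts at 2015-08-01 while id 2 starts at 2015-07-31, as in the expected output in the question.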