I have a DataFrame with sporadic dates as the index and two columns, 'id' and 'num'. I would like to group by
the 'id' column and reindex each group so that it covers every day between that group's first and last date.
My sample dataset looks like this:
            id  num
2015-08-01   1    3
2015-08-05   1    5
2015-08-06   1    4
2015-07-31   2    1
2015-08-03   2    2
2015-08-06   2    3
My expected output after reindexing with ffill is:
            id  num
2015-08-01   1    3
2015-08-02   1    3
2015-08-03   1    3
2015-08-04   1    3
2015-08-05   1    5
2015-08-06   1    4
2015-07-31   2    1
2015-08-01   2    1
2015-08-02   2    1
2015-08-03   2    2
2015-08-04   2    2
2015-08-05   2    2
2015-08-06   2    3
I have tried this, among other things, to no avail:
newdf = df.groupby('id').reindex(method='ffill')
This raises an error:
AttributeError: Cannot access callable attribute 'reindex' of 'DataFrameGroupBy' objects, try using the 'apply' method
Any help would be much appreciated.
There's probably a slicker way to do this, but this works:
import pandas as pd

def reindex_by_date(df):
    # Daily range from this group's first date to its last
    dates = pd.date_range(df.index.min(), df.index.max())
    return df.reindex(dates).ffill()

df.groupby('id').apply(reindex_by_date).reset_index(0, drop=True)
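
For anyone who wants to try this end to end, here is a minimal self-contained sketch built on the sample data from the question. It is a variation on the approach above, not a replacement for it, and it makes two assumptions of its own: it passes method="ffill" directly to reindex (calling reindex and then ffill introduces NaN in between, which can turn integer columns into floats), and it loops over the groups and concatenates them instead of calling apply, since how apply treats the grouping column has differed between pandas versions. The name result is only illustrative.

```python
import pandas as pd

# Sample data from the question: sporadic dates as the index.
df = pd.DataFrame(
    {"id": [1, 1, 1, 2, 2, 2], "num": [3, 5, 4, 1, 2, 3]},
    index=pd.to_datetime(
        ["2015-08-01", "2015-08-05", "2015-08-06",
         "2015-07-31", "2015-08-03", "2015-08-06"]
    ),
)


def reindex_by_date(group):
    # reindex(method="ffill") needs a sorted index, so sort first.
    group = group.sort_index()
    # Daily range from this group's first date to its last.
    dates = pd.date_range(group.index.min(), group.index.max(), freq="D")
    # Forward-fill while reindexing, so no NaN is ever introduced.
    return group.reindex(dates, method="ffill")


# Iterating over a groupby yields (key, sub-frame) pairs; each sub-frame
# still contains the 'id' column.
result = pd.concat([reindex_by_date(g) for _, g in df.groupby("id")])
print(result)
```

Each group is filled only between its own first and last date, so id 1 starts on 2015-08-01 while id 2 starts on 2015-07-31, matching the expected output in the question.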