Fill in missing pandas data with previous non-missing value, grouped by key

python pandas nan missing-data data-cleaning

ChrisB · May 2, 2013 · Viewed 13.7k times · Source

I am dealing with pandas DataFrames like this:

I would like to replace each NaN in 'x' with the previous non-NaN 'x' from a row with the same 'id' value:

Is there some slick way to do this without manually looping over rows?

Answer

You could group by 'id' and forward-fill 'x' within each group:

import numpy as np
import pandas as pd

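# Sample data: 'x' has gaps that should be filled per 'id'
df = pd.DataFrame({'id': [1, 1, 2, 2, 1, 2, 1, 1],
                   'x': [10, 20, 100, 200, np.nan, np.nan, 300, np.nan]})
# Forward-fill 'x' within each 'id' group; the original row order is kept
df['x'] = df.groupby('id')['x'].ffill()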
print(df)

yields

   id      x
0   1   10.0
1   1   20.0
2   2  100.0
3   2  200.0
4   1   20.0
5   2  200.0
6   1  300.0
7   1  300.0
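
To see why the groupby matters, here is a small side-by-side sketch using the same data. The names plain, grouped and comparison are only illustrative. A plain ffill() looks at the previous row regardless of 'id', so row 4 (id 1) would pick up 200 from row 3 (id 2) instead of 20:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 2, 2, 1, 2, 1, 1],
                   'x': [10, 20, 100, 200, np.nan, np.nan, 300, np.nan]})

# Plain forward fill: ignores 'id' and copies whatever value came last
plain = df['x'].ffill()

# Grouped forward fill: only looks back at rows with the same 'id'
grouped = df.groupby('id')['x'].ffill()

# Put both results next to each other for comparison (illustrative names)
comparison = pd.DataFrame({'id': df['id'], 'plain': plain, 'grouped': grouped})
print(comparison)
```

The two columns differ only at row 4. Note that a forward fill has nothing to copy if a group starts with NaN, so such leading values stay NaN.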
