I've got a dataframe with the following information:
    filename  val1  val2
t
1  file1.csv     5    10
2  file1.csv   NaN   NaN
3  file1.csv    15    20
6  file2.csv   NaN   NaN
7  file2.csv    10    20
8  file2.csv    12    15
I would like to interpolate the values in the dataframe based on the indices, but only within each file group.
To interpolate, I would normally do
df = df.interpolate(method="index")
And to group, I do
grouped = df.groupby("filename")
I would like the interpolated dataframe to look like this:
    filename  val1  val2
t
1  file1.csv     5    10
2  file1.csv    10    15
3  file1.csv    15    20
6  file2.csv   NaN   NaN
7  file2.csv    10    20
8  file2.csv    12    15
The NaNs at t = 6 should remain, since they are the first rows in the file2 group and there is no earlier value in that group to interpolate from.
I suspect I need to use "apply", but haven't been able to figure out exactly how...
grouped.apply(interp1d)
...
TypeError: __init__() takes at least 3 arguments (2 given)
Any help would be appreciated.
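interp1d is a SciPy class whose constructor needs both x and y arrays, which is why passing it straight to apply raises that TypeError. Instead, call the dataframe's own interpolate on each group inside apply:
>>> df.groupby('filename').apply(lambda group: group.interpolate(method='index'))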
    filename  val1  val2
t
1  file1.csv     5    10
2  file1.csv    10    15
3  file1.csv    15    20
6  file2.csv   NaN   NaN
7  file2.csv    10    20
8  file2.csv    12    15
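For anyone who wants to run this end to end, here is a minimal, self-contained sketch. It rebuilds the frame from the question (the index name t and the column names are read off the printout above) and interpolates only the numeric columns per file. Two things here are my own assumptions rather than part of the original one-liner: the value_cols list, and passing group_keys=False, which I use because in more recent pandas versions the result of apply may otherwise carry filename as an extra index level.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {
        "filename": ["file1.csv"] * 3 + ["file2.csv"] * 3,
        "val1": [5, np.nan, 15, np.nan, 10, 12],
        "val2": [10, np.nan, 20, np.nan, 20, 15],
    },
    index=pd.Index([1, 2, 3, 6, 7, 8], name="t"),
)

# Assumption: these are the only columns that should be interpolated.
value_cols = ["val1", "val2"]

# Interpolate each file separately, using the index values as x coordinates.
# group_keys=False keeps the original "t" index on the result.
interpolated = df.groupby("filename", group_keys=False)[value_cols].apply(
    lambda group: group.interpolate(method="index")
)

# Write the interpolated values back; assignment aligns on the "t" index.
result = df.copy()
result[value_cols] = interpolated
print(result)
```

Because each group is interpolated on its own, the leading NaNs at t = 6 are left alone: interpolate fills forward by default and there is no earlier value inside the file2 group.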