Reputation: 131
I am trying to use linear interpolation to fill in the missing values in my data frame. The interpolation should be applied separately to each group of rows that share the same id. An example of the data frame is below:
mdata:
id f1 f2 f3 f4 f5
d1 34 3 5 nan 6
d1 nan 4 6 9 7
d1 37 nan 6 10 8
d2 nan 7 8 1 32
d2 12 8 nan 45 56
d2 13 9 11 46 59
Given the above example, I want to apply the interpolation to the rows with id d1, then to the rows with id d2, and so on. I tried to group them and then interpolate, but something seems to be wrong in my code:
mdata=[~mdata['id'].map(mdata.groupby('id').apply(mdata.interpolate(method='linear', limit_direction='both')))]
My desired output should be something like this:
output:
id f1 f2 f3 f4 f5
d1 34 3 5 9 6
d1 35.5 4 6 9 7
d1 37 5 6 10 8
d2 12 7 8 1 32
d2 12 8 9.5 45 56
d2 13 9 11 46 59
Upvotes: 1
Views: 137
Reputation: 24314
You can define a function:
def f(x):
    # Linearly interpolate one group; 'both' also fills gaps at the edges
    return x.interpolate(method='linear', limit_direction='both')

# Finally, apply it to each id group:
mdata = mdata.groupby('id').apply(f)
Or, via an anonymous function:
mdata = (mdata.groupby('id')
              .apply(lambda x: x.interpolate(method='linear', limit_direction='both')))
Output of mdata:
id f1 f2 f3 f4 f5
0 d1 34.0 3.0 5.0 9.0 6
1 d1 35.5 4.0 6.0 9.0 7
2 d1 37.0 4.0 6.0 10.0 8
3 d2 12.0 7.0 8.0 1.0 32
4 d2 12.0 8.0 9.5 45.0 56
5 d2 13.0 9.0 11.0 46.0 59
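For anyone who wants to reproduce this end to end, here is a minimal sketch that rebuilds the sample frame from the question and interpolates per group. It swaps apply for groupby(...).transform on the numeric columns only, which is my own variation and not the code above: depending on the pandas version, groupby(...).apply may add the group key to the index or warn about the grouping column, while transform keeps the original index. The names value_cols and filled are just illustrative. Note that linear interpolation does not extrapolate: with limit_direction='both', a gap at the edge of a group is filled with the nearest valid value, which is why the last d1 row gets f2 = 4.0 rather than the 5 shown in the question's desired output.

```python
import numpy as np
import pandas as pd

# Sample frame from the question
mdata = pd.DataFrame({
    "id": ["d1", "d1", "d1", "d2", "d2", "d2"],
    "f1": [34, np.nan, 37, np.nan, 12, 13],
    "f2": [3, 4, np.nan, 7, 8, 9],
    "f3": [5, 6, 6, 8, np.nan, 11],
    "f4": [np.nan, 9, 10, 1, 45, 46],
    "f5": [6, 7, 8, 32, 56, 59],
})

# Columns to interpolate (illustrative name; everything except the id column)
value_cols = ["f1", "f2", "f3", "f4", "f5"]

# Interpolate within each id group; transform returns a result aligned
# with the original index, so it can be assigned straight back.
filled = mdata.copy()
filled[value_cols] = (
    mdata.groupby("id")[value_cols]
         .transform(lambda g: g.interpolate(method="linear", limit_direction="both"))
)

print(filled)
```

Because the group key is never passed to interpolate here, the id column stays untouched and the result keeps the plain 0 to 5 index.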
Upvotes: 2