jeny ericsoon

Reputation: 131

Interpolation based on unique value in a data frame

I am trying to use linear interpolation to fill in the missing values in my data frame. The interpolation should be applied separately to each group of rows that share the same id. An example of the data frame is below:

   mdata:
       id   f1      f2   f3    f4     f5
       d1   34      3    5     nan    6
       d1   nan     4    6     9      7
       d1   37    nan    6     10     8
       d2   nan     7    8     1      32    
       d2   12      8   nan    45     56    
       d2   13      9    11    46     59    

Given the above example, I want to apply the interpolation to the rows with id d1, then to the rows with id d2, and so on. I tried to group them and then interpolate, but something seems to be wrong in my code:

   mdata=[~mdata['id'].map(mdata.groupby('id').apply(mdata.interpolate(method='linear', limit_direction='both')))]

My desired output should be something like this:

   output:
       id   f1      f2   f3    f4     f5
       d1   34      3    5     9      6
       d1   35.5    4    6     9      7
       d1   37      5    6     10     8
       d2   12      7    8     1      32
       d2   12      8    9.5   45     56
       d2   13      9    11    46     59

Upvotes: 1

Views: 137

Answers (1)

Anurag Dabas

Reputation: 24314

You can define a function:

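def f(x):
    # Linearly interpolate the gaps within one group, filling in both directions
    return x.interpolate(method='linear', limit_direction='both')

# Finally, apply it to each id group:
mdata = mdata.groupby('id').apply(f)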

Or, with an anonymous function:

mdata = (mdata.groupby('id')
              .apply(lambda x: x.interpolate(method='linear', limit_direction='both')))

Output of mdata:

   id    f1   f2    f3    f4  f5
0  d1  34.0  3.0   5.0   9.0   6
1  d1  35.5  4.0   6.0   9.0   7
2  d1  37.0  4.0   6.0  10.0   8
3  d2  12.0  7.0   8.0   1.0  32
4  d2  12.0  8.0   9.5  45.0  56
5  d2  13.0  9.0  11.0  46.0  59
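
For reference, here is a minimal self-contained sketch that rebuilds the sample frame from the question and interpolates per id. It is a variation on the approach above, not the only way to do it: it uses groupby(...).transform on the value columns only, so the string id column is never passed to interpolate (depending on your pandas version, interpolating an object column or applying over the grouping column may raise a warning). The names value_cols and filled are just illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NaN marks the missing values
mdata = pd.DataFrame({
    "id": ["d1", "d1", "d1", "d2", "d2", "d2"],
    "f1": [34, np.nan, 37, np.nan, 12, 13],
    "f2": [3, 4, np.nan, 7, 8, 9],
    "f3": [5, 6, 6, 8, np.nan, 11],
    "f4": [np.nan, 9, 10, 1, 45, 46],
    "f5": [6, 7, 8, 32, 56, 59],
})

# Every column except the grouping key (illustrative helper name)
value_cols = [c for c in mdata.columns if c != "id"]

# Interpolate each value column within its own id group.
# transform returns a frame aligned with the original index.
filled = mdata.copy()
filled[value_cols] = (
    mdata.groupby("id")[value_cols]
         .transform(lambda s: s.interpolate(method="linear",
                                            limit_direction="both"))
)

print(filled)
```

Note that linear interpolation in pandas does not extrapolate: a gap at the start or end of a group is filled with the nearest valid value. That is why f2 in the last d1 row comes out as 4.0 rather than the 5 shown in the desired output.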

Upvotes: 2
