Reputation: 151
I have a problem with pandas interpolate(). I only want to interpolate gaps that contain at most 2 successive np.nan values. But with limit=2, interpolate() also fills the first two values of gaps that contain more than 2 np.nans!?
import numpy as np
import pandas as pd
s = pd.Series(data=[np.nan, 10, np.nan, np.nan, np.nan, 5, np.nan, 6, np.nan, np.nan, 30])
a = s.interpolate(limit=2,limit_area='inside')
print(a)
The output I get is:
0 NaN
1 10.00
2 8.75
3 7.50
4 NaN
5 5.00
6 5.50
7 6.00
8 14.00
9 22.00
10 30.00
dtype: float64
I do not want the interpolated values in rows 2 and 3. What I want is:
0 NaN
1 10.00
2 NaN
3 NaN
4 NaN
5 5.00
6 5.50
7 6.00
8 14.00
9 22.00
10 30.00
dtype: float64
Can anybody please help?
Upvotes: 1
Views: 191
Reputation: 30920
Use GroupBy.transform with Series.where:
s_notna = s.notna()
# Each non-NaN value starts a new group, so a group of size <= 3 holds at most 2 NaNs.
m = (s.groupby(s_notna.cumsum()).transform('size').le(3) | s_notna)
s = s.interpolate(limit_area='inside').where(m)
print(s)
Output
0 NaN
1 10.0
2 NaN
3 NaN
4 NaN
5 5.0
6 5.5
7 6.0
8 14.0
9 22.0
10 30.0
dtype: float64
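The same idea can be wrapped in a small helper so the maximum gap length becomes a parameter. This is only a sketch: the function name `interpolate_short_gaps` and the `max_gap` argument are made up for illustration, and it counts the NaNs in each group directly instead of comparing the group size to 3.

```python
import numpy as np
import pandas as pd


def interpolate_short_gaps(s: pd.Series, max_gap: int = 2) -> pd.Series:
    """Linearly interpolate only interior NaN runs of at most max_gap values."""
    notna = s.notna()
    # Each non-NaN value starts a new group, so a group is one valid value
    # followed by the run of NaNs that trails it.
    group_id = notna.cumsum()
    # Count the NaNs in each group and broadcast that count to every row.
    gap_len = s.isna().groupby(group_id).transform("sum")
    # Keep original values, plus positions that belong to a short enough gap.
    keep = notna | gap_len.le(max_gap)
    # Interpolate everything inside, then blank out the gaps that were too long.
    return s.interpolate(limit_area="inside").where(keep)


s = pd.Series([np.nan, 10, np.nan, np.nan, np.nan, 5, np.nan, 6, np.nan, np.nan, 30])
result = interpolate_short_gaps(s, max_gap=2)
print(result)
```

Leading and trailing NaNs stay untouched because limit_area="inside" never fills them, regardless of the mask.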
Upvotes: 1