Reputation: 5549
Given the DataFrame df:
import numpy as np
import pandas as pd

df = pd.DataFrame(data=[[np.nan, 1],
                        [np.nan, np.nan],
                        [1, 2],
                        [2, 3],
                        [np.nan, np.nan],
                        [np.nan, np.nan],
                        [3, 4],
                        [4, 5],
                        [np.nan, np.nan],
                        [np.nan, np.nan]], columns=['A', 'B'])
df
Out[16]:
A B
0 NaN 1.0
1 NaN NaN
2 1.0 2.0
3 2.0 3.0
4 NaN NaN
5 NaN NaN
6 3.0 4.0
7 4.0 5.0
8 NaN NaN
9 NaN NaN
I need to replace the NaN values using the following rules:
1) If the NaNs are at the beginning, replace them with the first value that follows them.
2) If the NaNs lie between two values, replace them with the average of the nearest value before and the nearest value after.
3) If the NaNs are at the end, replace them with the last value before them.
Expected output:
A B
0 1.0 1.0
1 1.0 1.5
2 1.0 2.0
3 2.0 3.0
4 2.5 3.5
5 2.5 3.5
6 3.0 4.0
7 4.0 5.0
8 4.0 5.0
9 4.0 5.0
Upvotes: 1
Views: 188
Reputation: 862396
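Use add to sum the back-filled and forward-filled values, then divide by 2. This averages the two neighbours of every interior gap and leaves existing values unchanged. Finally, chain ffill and bfill to replace the remaining leading and trailing NaNs: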
df = df.bfill().add(df.ffill()).div(2).ffill().bfill()
print(df)
A B
0 1.0 1.0
1 1.0 1.5
2 1.0 2.0
3 2.0 3.0
4 2.5 3.5
5 2.5 3.5
6 3.0 4.0
7 4.0 5.0
8 4.0 5.0
9 4.0 5.0
Detail (the intermediate sum, computed on the original df before dividing and filling the edges):
print(df.bfill().add(df.ffill()))
A B
0 NaN 2.0
1 NaN 3.0
2 2.0 4.0
3 4.0 6.0
4 5.0 7.0
5 5.0 7.0
6 6.0 8.0
7 8.0 10.0
8 NaN NaN
9 NaN NaN
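The chain above can also be read step by step. The following is only a minimal sketch of the same idea, wrapped in a helper called fill_gaps (a hypothetical name, not part of pandas) and run on the question's sample data. Where a value already exists, the back-fill and forward-fill agree, so the average is the value itself. Where a leading or trailing NaN has only one neighbour, the sum stays NaN until the final ffill/bfill.

```python
import numpy as np
import pandas as pd


def fill_gaps(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: fill NaNs following the three rules in the question."""
    back = frame.bfill()   # each NaN takes the next known value below it
    fwd = frame.ffill()    # each NaN takes the last known value above it
    # Interior gaps become the mean of their two neighbours; rows with a
    # neighbour on only one side stay NaN because NaN + x is NaN.
    averaged = back.add(fwd).div(2)
    # ffill covers the trailing NaNs, bfill covers the leading NaNs.
    return averaged.ffill().bfill()


df = pd.DataFrame(data=[[np.nan, 1],
                        [np.nan, np.nan],
                        [1, 2],
                        [2, 3],
                        [np.nan, np.nan],
                        [np.nan, np.nan],
                        [3, 4],
                        [4, 5],
                        [np.nan, np.nan],
                        [np.nan, np.nan]], columns=['A', 'B'])

result = fill_gaps(df)
print(result)
```

Note that every NaN in a gap receives the same average, which is what the question asks for. A linear interpolation would instead produce different values across a gap of two or more rows.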
Upvotes: 5