Reputation: 951
Description: I am trying to interpolate missing values (represented as NaN), but the method only works on NaN values that lie between known values. I am also quite confused about how the missing values are computed with bfill. As I understand it, bfill only fills each missing value with the first succeeding known value. Here's an example:
>>> df = pd.DataFrame([['M', '2014-01-01 00:26:00', '2'], ['M', 'M', 'M'], ['M', '2014-01-01 00:26:30', 9],[5, '2014-01-01 00:26:50', 'M'],[6, '2014-01-01 00:26:50', 'M']], columns=['x','y','z'])
>>> df
x y z
0 M 2014-01-01 00:26:00 2
1 M M M
2 M 2014-01-01 00:26:30 9
3 5 2014-01-01 00:26:50 M
4 6 2014-01-01 00:26:50 M
>>> df = df.replace(['M'],[np.NaN])
>>> df
x y z
0 NaN 2014-01-01 00:26:00 2
1 NaN NaN NaN
2 NaN 2014-01-01 00:26:30 9
3 5 2014-01-01 00:26:50 NaN
4 6 2014-01-01 00:26:50 NaN
>>> df['x'] = df['x'].astype(np.float64)
>>> df['z'] = df['z'].astype(np.float64)
>>> df['y'] = pd.to_datetime(df['y'])
>>> df
x y z
0 NaN 2014-01-01 00:26:00 2
1 NaN NaT NaN
2 NaN 2014-01-01 00:26:30 9
3 5 2014-01-01 00:26:50 NaN
4 6 2014-01-01 00:26:50 NaN
>>> df.interpolate()
x y z
0 NaN 2014-01-01 00:26:00 2.0
1 NaN NaT 5.5
2 NaN 2014-01-01 00:26:30 9.0
3 5 2014-01-01 00:26:50 9.0
4 6 2014-01-01 00:26:50 9.0
>>> df.interpolate(method='bfill')  # try to fill the first three rows in x
x y z
0 2 2014-01-01 00:26:00 2
1 NaN NaT NaN
2 9 2014-01-01 00:26:30 9
3 5 2014-01-01 00:26:50 NaN
4 6 2014-01-01 00:26:50 NaN
Goal: I want to fill x and z and, if possible, also fill y, which has a datetime type.
Upvotes: 0
Views: 2133
Reputation: 31662
IIUC you could use interpolate to get your values for the z column and then fillna with bfill:
In [122]: df.interpolate().fillna(method='bfill')
Out[122]:
x y z
0 5 2014-01-01 00:26:00 2.0
1 5 2014-01-01 00:26:30 5.5
2 5 2014-01-01 00:26:30 9.0
3 5 2014-01-01 00:26:50 9.0
4 6 2014-01-01 00:26:50 9.0
Or:
In [128]: df.fillna(method='bfill').interpolate()
Out[128]:
x y z
0 5 2014-01-01 00:26:00 2
1 5 2014-01-01 00:26:30 9
2 5 2014-01-01 00:26:30 9
3 5 2014-01-01 00:26:50 9
4 6 2014-01-01 00:26:50 9
The order of the two methods depends on how you want to fill the last column (z).
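To make the difference between the two orderings easier to see, here is a minimal self-contained sketch using only the numeric columns x and z from the question (y is left out to keep it short). It assumes a reasonably recent pandas and calls .bfill() directly rather than fillna(method='bfill'), since the method= argument of fillna is, as far as I know, deprecated in newer releases. The variable names are just for illustration.

```python
import numpy as np
import pandas as pd

# Numeric columns from the question, with 'M' already replaced by NaN.
df = pd.DataFrame({
    "x": [np.nan, np.nan, np.nan, 5.0, 6.0],
    "z": [2.0, np.nan, 9.0, np.nan, np.nan],
})

# Option 1: interpolate first, then back-fill whatever is still missing.
# Linear interpolation fills the gap in z (2 -> 5.5 -> 9) and carries the
# last known value forward; the leading NaNs in x are left for bfill.
interp_then_bfill = df.interpolate().bfill()

# Option 2: back-fill first, then interpolate.
# bfill copies 9 into row 1 of z, so no gap is left to interpolate there;
# interpolate only fills the trailing NaNs in z with the last known value.
bfill_then_interp = df.bfill().interpolate()

print(interp_then_bfill)
print(bfill_then_interp)
```

With option 1, z comes out as 2.0, 5.5, 9.0, 9.0, 9.0; with option 2 it is 2.0, 9.0, 9.0, 9.0, 9.0. In both cases x becomes 5, 5, 5, 5, 6, because its leading NaNs have no earlier known value to interpolate from and are simply back-filled.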
Upvotes: 3