Reputation: 2213
I would like to fill missing (NaN) values in a column with values that depend on the last non-NaN value. My data looks like this:
In [3]: A = pd.DataFrame(['X', np.nan, np.nan, 'Y',np.nan, np.nan, 'X', np.nan])
In [4]: A
Out[4]:
0
0 X
1 NaN
2 NaN
3 Y
4 NaN
5 NaN
6 X
7 NaN
I am aware of the fillna function, but this is not exactly what I want to do. This gives me the following:
In [5]: A.fillna(method='ffill') # Not what I want to do
Out[5]:
0
0 X
1 X
2 X
3 Y
4 Y
5 Y
6 X
7 X
For example, I would like to fill in an 'I' if the last value was 'X' and a 'J' if the last value was 'Y', i.e.:
Out[5]: # How do I get this?
0
0 X
1 I
2 I
3 Y
4 J
5 J
6 X
7 I
I am sure I could do this with a loop, but how do I do it without resorting to that?
Upvotes: 2
Views: 152
Reputation: 33793
You can create a dictionary mapping each preceding value to the desired fill value. Apply that mapping to the DataFrame with replace, forward fill the result with ffill, and pass it to fillna so that only the missing cells are filled:
nan_map = {'X': 'I', 'Y': 'J'}
A = A.fillna(A.replace(nan_map).ffill())
The resulting output:
0
0 X
1 I
2 I
3 Y
4 J
5 J
6 X
7 I
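To see why this works, here is a minimal sketch of the same one-liner split into its three steps. It assumes pandas and NumPy are imported as pd and np, and the intermediate names mapped and fill_values are only illustrative; they are not part of the original answer.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame(['X', np.nan, np.nan, 'Y', np.nan, np.nan, 'X', np.nan])
nan_map = {'X': 'I', 'Y': 'J'}

# Step 1: swap each known value for its fill value; NaN cells stay NaN.
mapped = A.replace(nan_map)

# Step 2: forward fill, so every row carries the fill value
# that belongs to the last non-NaN row above it.
fill_values = mapped.ffill()

# Step 3: fillna only touches cells that are NaN in A,
# so the original 'X' and 'Y' entries are left alone.
result = A.fillna(fill_values)

print(result[0].tolist())
# ['X', 'I', 'I', 'Y', 'J', 'J', 'X', 'I']
```

Because fillna aligns the DataFrame passed to it on index and columns, the fill values in the rows that already hold 'X' or 'Y' are simply ignored.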
Upvotes: 6