Fill NaNs in dataframe column depending on last value

Question

I would like to fill missing (NaN) values in a column with values that depend on the last non-NaN value. My data looks like this:

In [3]: A = pd.DataFrame(['X', np.nan, np.nan, 'Y',np.nan, np.nan, 'X', np.nan])

In [4]: A
Out[4]:
     0
0    X
1  NaN
2  NaN
3    Y
4  NaN
5  NaN
6    X
7  NaN

I am aware of the fillna function, but this is not exactly what I want to do. This gives me the following:

In [5]: A.ffill() # Not what I want to do
Out[5]:
   0
0  X
1  X
2  X
3  Y
4  Y
5  Y
6  X
7  X

For example, I would like to fill in 'I' if the last value was 'X' and 'J' if the last value was 'Y', i.e.:

Out[5]: # How do I get this?
   0
0  X
1  I
2  I
3  Y
4  J
5  J
6  X
7  I

I am sure I could do this with a loop, but how do I do it without resorting to that?

root · Accepted Answer

You can create a dictionary that maps each preceding value to its desired fill value, then call fillna with a forward-filled copy of the DataFrame that has the mapping applied, using replace and ffill:

nan_map = {'X': 'I', 'Y': 'J'}
A = A.fillna(A.replace(nan_map).ffill())

The resulting output:

   0
0  X
1  I
2  I
3  Y
4  J
5  J
6  X
7  I
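
To see why this works, the one-liner can be split into its three steps. The following is a minimal sketch that rebuilds A from the question; the names mapped, fill_values, and result are introduced here for illustration only and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
A = pd.DataFrame(['X', np.nan, np.nan, 'Y', np.nan, np.nan, 'X', np.nan])
nan_map = {'X': 'I', 'Y': 'J'}

# Step 1: swap each known value for the fill value that should follow it.
mapped = A.replace(nan_map)       # I, NaN, NaN, J, NaN, NaN, I, NaN

# Step 2: carry those fill values forward over the gaps.
fill_values = mapped.ffill()      # I, I, I, J, J, J, I, I

# Step 3: fill only the NaN cells of A from the aligned fill_values frame.
result = A.fillna(fill_values)

print(result[0].tolist())
# ['X', 'I', 'I', 'Y', 'J', 'J', 'X', 'I']
```

Because fillna only writes into cells that are NaN, the original 'X' and 'Y' entries stay as they are, even though fill_values holds 'I' or 'J' at those positions.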
