Pandas: fill NaN based on the last available value and the next available value

Question

I have CSV data that looks like this:

I would like to fill the empty cells in column B based on column A, but only if the following holds in column A:

last_available_value_before_the_NaNs_in_A.split()[-1] == next_available_value_after_the_NaNs_in_A.split()[-1]

The desired result would be:

     A     B
0   x aa   aa 
1   z aa   aa
2          aa
3          aa
4   x aa   aa
5   z bb   bb
6   x bb   bb
7          
8   z cc   cc

data.loc[7, 'B'] will be NaN because data.loc[6, 'A'].split()[-1] == data.loc[8, 'A'].split()[-1] is false.

data.loc[5, 'B'] is 'bb' because data.loc[5, 'A'].split()[-1] == 'bb'.

Thanks for your help!

a_guest · Accepted Answer

You could compare a version that uses ffill and one that uses bfill:

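# Last word of the nearest non-NaN value above (f) and below (b) each row
f = df['A'].ffill().str.split().str[-1]
b = df['A'].bfill().str.split().str[-1]
# Keep B where the two words differ; otherwise fill it with the shared word
df['B'] = df['B'].where(f != b, f)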
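
To check the idea against the sample in the question, here is a minimal self-contained sketch. The input frame is an assumption: the question's original data is not shown, so column A is reconstructed from the expected output and B is assumed to start out entirely empty.

```python
import numpy as np
import pandas as pd

# Assumed input, reconstructed from the expected output in the question.
df = pd.DataFrame({
    "A": ["x aa", "z aa", np.nan, np.nan, "x aa",
          "z bb", "x bb", np.nan, "z cc"],
})
# Assumption: B is empty to begin with (object dtype so it can hold strings).
df["B"] = pd.Series(np.nan, index=df.index, dtype=object)

# Last word of the nearest non-NaN value above (f) and below (b) each row.
f = df["A"].ffill().str.split().str[-1]
b = df["A"].bfill().str.split().str[-1]

# Where the two words differ, keep B as it is; where they agree, use that word.
df["B"] = df["B"].where(f != b, f)

print(df)
```

On rows where A is present, f and b are both the last word of that row, so B receives it. On a NaN row, f and b come from the neighbours on either side, so B is filled only when those neighbours end in the same word. A NaN run at the very start or end of A has no neighbour on one side, so one of the two words is NaN and B is left unchanged there.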
