Reputation: 105
Suppose I have the following DataFrame:
import pandas as pd
import numpy as np
test = pd.DataFrame(data={
    'a': [1, np.nan, np.nan, 4, np.nan, 5, 6, 7, 8, np.nan, np.nan, 6],
    'b': [10, np.nan, np.nan, 1, np.nan, 1, 1, np.nan, 1, 1, np.nan, 1]
})
I would like to use DataFrame.fillna(method='ffill'), but only when two separate conditions are both met: every column in the row is NaN, and the most recent non-NaN value in column 'b' is 10.
Note: the first row can never be NaN
I am looking for a smart way, maybe a lambda expression or a vectorized form, that avoids a for loop or .iterrows().
Required result:
result = pd.DataFrame(data={
    'a': [1, 1, 1, 4, np.nan, 5, 6, 7, 8, np.nan, np.nan, 6],
    'b': [10, 10, 10, 1, np.nan, 1, 1, np.nan, 1, 1, np.nan, 1]
})
Upvotes: 1
Views: 332
Reputation: 862551
You can test whether the forward-filled value in b is 10 and whether every column in the row is missing, then pass the resulting mask to DataFrame.mask together with the forward-filled frame from ffill:
mask = test['b'].ffill().eq(10) & test.isna().all(axis=1)
test = test.mask(mask, test.ffill())
print (test)
      a     b
0   1.0  10.0
1   1.0  10.0
2   1.0  10.0
3   4.0   1.0
4   NaN   NaN
5   5.0   1.0
6   6.0   1.0
7   7.0   NaN
8   8.0   1.0
9   NaN   1.0
10  NaN   NaN
11  6.0   1.0
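As a quick check, the sketch below rebuilds the frame from the question, splits the mask into its two conditions, and compares the outcome with the required result. The names last_b_is_10, row_all_nan, filled, and expected are introduced here for illustration only and do not appear in the original posts.

```python
import numpy as np
import pandas as pd

# Input frame and required result, copied from the question.
test = pd.DataFrame({
    'a': [1, np.nan, np.nan, 4, np.nan, 5, 6, 7, 8, np.nan, np.nan, 6],
    'b': [10, np.nan, np.nan, 1, np.nan, 1, 1, np.nan, 1, 1, np.nan, 1],
})
expected = pd.DataFrame({
    'a': [1, 1, 1, 4, np.nan, 5, 6, 7, 8, np.nan, np.nan, 6],
    'b': [10, 10, 10, 1, np.nan, 1, 1, np.nan, 1, 1, np.nan, 1],
})

# Condition 1: the most recent non-missing value in 'b' is 10.
last_b_is_10 = test['b'].ffill().eq(10)
# Condition 2: every column in the row is missing.
row_all_nan = test.isna().all(axis=1)
mask = last_b_is_10 & row_all_nan

# Where mask is True, take the forward-filled values; elsewhere keep the original.
filled = test.mask(mask, test.ffill())

# Raises AssertionError if the frames differ; NaNs in the same position compare equal.
pd.testing.assert_frame_equal(filled, expected)
```

Only rows 1 and 2 satisfy both conditions here, so rows 4 and 10 stay NaN even though they are entirely missing, because the last valid value of b before them is 1.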
Upvotes: 4