Reputation: 171
I have a df populated with numbers and nans similar to below:
import numpy as np
import pandas as pd

df = pd.DataFrame({'date1': [2, 4, 3, 0],
                   'date2': [2, np.nan, np.nan, 7],
                   'date3': [10, 2, 1, 8],
                   'date4': [4, 11, np.nan, 8],
                   'date5': [10, 2, 1, 8]
                   })
   date1  date2  date3  date4  date5
0      2    2.0     10    4.0     10
1      4    NaN      2   11.0      2
2      3    NaN      1    NaN      1
3      0    7.0      8    8.0      8
I am trying to iterate over rows using iterrows and replace every value that is either greater than 6 or NaN with the previous value in the same row (the column to its left). However, I just read that you should avoid making changes to a df using iterrows. Is that really true, and if so, is there a built-in method to get this done effectively in pandas? Thank you.
Expected output:
date1 date2 date3 date4 date5
0 2 2.0 *2.0* 4.0 *4.0*
1 4 *4* 2 *2* 2
2 3 *3* 1 *1* 1
3 0 *0* *0* *0* *0*
Upvotes: 1
Views: 90
Reputation: 29635
First mask all the values greater than 6 (this turns them into NaN), then use ffill along the columns (axis=1) so that every NaN takes the previous valid value in its row.
res = df.mask(df > 6).ffill(axis=1)
print(res)
   date1  date2  date3  date4  date5
0    2.0    2.0    2.0    4.0    4.0
1    4.0    4.0    2.0    2.0    2.0
2    3.0    3.0    1.0    1.0    1.0
3    0.0    0.0    0.0    0.0    0.0
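For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same mask-then-ffill idea. The helper name fill_from_left and its threshold parameter are made up for illustration and are not part of pandas. Only DataFrame.mask and DataFrame.ffill are actual pandas methods.

```python
import numpy as np
import pandas as pd


def fill_from_left(df, threshold=6):
    """Hypothetical helper: replace NaN and values above `threshold`
    with the nearest valid value to the left in the same row."""
    # mask() sets every entry where the condition is True to NaN,
    # so "too large" and "missing" become the same case.
    masked = df.mask(df > threshold)
    # ffill(axis=1) propagates the last valid value across the columns
    # of each row; no explicit Python loop over rows is needed.
    return masked.ffill(axis=1)


df = pd.DataFrame({'date1': [2, 4, 3, 0],
                   'date2': [2, np.nan, np.nan, 7],
                   'date3': [10, 2, 1, 8],
                   'date4': [4, 11, np.nan, 8],
                   'date5': [10, 2, 1, 8]})

res = fill_from_left(df)
print(res)
```

One thing to keep in mind: if the value in the first column is itself NaN or greater than the threshold, there is nothing to its left to copy, so it stays NaN after the ffill.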
Upvotes: 2