Iterate over pandas df rows to change/amend values

Question

I have a df populated with numbers and NaNs, similar to the one below:

import numpy as np
import pandas as pd

df = pd.DataFrame({'date1': [2, 4, 3, 0],
                   'date2': [2, np.nan, np.nan, 7],
                   'date3': [10, 2, 1, 8],
                   'date4': [4, 11, np.nan, 8],
                   'date5': [10, 2, 1, 8]
                  })


    date1   date2   date3   date4   date5
0     2     2.0      10      4.0     10
1     4     NaN      2       11.0    2
2     3     NaN      1       NaN     1
3     0     7.0      8       8.0     8

I am trying to iterate over the rows with iterrows and replace every value that is either greater than 6 or NaN with the previous value in the same row (the column to its left). However, I just read that you should avoid making changes to a df using iterrows. Is that really true, and if so, is there a built-in pandas method that does this efficiently? Thank you.

Expected output:

    date1   date2   date3   date4   date5
0     2      2.0     *2.0*   4.0     *4.0*
1     4      *4*       2     *2*       2
2     3      *3*       1     *1*       1
3     0      *0*      *0*    *0*      *0*

Ben.T · Accepted Answer

First mask all values greater than 6 (mask replaces them with NaN), then use ffill along the columns (axis=1) so that each NaN takes the last valid value to its left:

res = df.mask(df > 6).ffill(axis=1)
print(res)
   date1  date2  date3  date4  date5
0    2.0    2.0    2.0    4.0    4.0
1    4.0    4.0    2.0    2.0    2.0
2    3.0    3.0    1.0    1.0    1.0
3    0.0    0.0    0.0    0.0    0.0
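
To make the two steps easier to follow, here is a minimal sketch that splits the one-liner into its parts, using the df from the question. The names masked, edge and edge_res are only illustrative, and the small edge frame is made-up data added to show one caveat: a value masked in the first column has nothing to its left, so it stays NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'date1': [2, 4, 3, 0],
                   'date2': [2, np.nan, np.nan, 7],
                   'date3': [10, 2, 1, 8],
                   'date4': [4, 11, np.nan, 8],
                   'date5': [10, 2, 1, 8]})

# Step 1: values greater than 6 become NaN. Existing NaNs are left alone,
# because NaN > 6 evaluates to False.
masked = df.mask(df > 6)

# Step 2: fill each NaN with the last valid value to its left (same row).
res = masked.ffill(axis=1)
print(res)

# Caveat (hypothetical data): a masked value in the first column has no
# value to its left, so ffill leaves it as NaN.
edge = pd.DataFrame({'date1': [9, 1], 'date2': [3, 8]})
edge_res = edge.mask(edge > 6).ffill(axis=1)
print(edge_res)
```

If a leading NaN like that is possible in your data, you would need to decide separately how to handle it (for example with fillna). As for iterrows, the pandas documentation warns against modifying what you are iterating over, since the iterator may return a copy and writes to it can have no effect, so a vectorised approach like this is the safer choice.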
