Reputation: 29
In the dataframe, I want to create 3 new columns labeled as A-hat, B-hat, C-hat, whereby I want to return the highest value by comparing the original column A, B and C, else return as NaN
. Hence in each of the row, the 3 new column should return two NaN
and one highest value.
input df:
A B C
Date
2020-01-05 3.57 5.29 6.23
2020-01-04 4.98 9.64 7.58
2020-01-03 3.79 5.25 6.26
2020-01-02 3.95 5.65 6.61
2020-01-01 -3.10 -7.20 -8.16
output df:
A B C A-hat B-hat C-hat
Date
2020-01-05 3.57 5.29 6.23 NaN NaN 6.23
2020-01-04 4.98 9.64 7.58 NaN 9.64 NaN
2020-01-03 3.79 5.25 6.26 NaN NaN 6.26
2020-01-02 3.95 5.65 6.61 NaN NaN 6.61
2020-01-01 -3.10 -7.20 -8.16 -3.10 NaN NaN
How can I achieve this output?
Upvotes: 1
Views: 304
Reputation: 863791
You can compare maximal values by DataFrame.max
in DataFrame.eq
and set missing values by DataFrame.where
if not matched mask:
df = df.join(df.where(df.eq(df.max(axis=1), axis=0)).add_suffix('-hat'))
print (df)
A B C A-hat B-hat C-hat
Date
2020-01-05 3.57 5.29 6.23 NaN NaN 6.23
2020-01-04 4.98 9.64 7.58 NaN 9.64 NaN
2020-01-03 3.79 5.25 6.26 NaN NaN 6.26
2020-01-02 3.95 5.65 6.61 NaN NaN 6.61
2020-01-01 -3.10 -7.20 -8.16 -3.1 NaN NaN
Upvotes: 1