mattiasostmar
mattiasostmar

Reputation: 3109

Find max value in row and fill new column with that columns name using Pandas?

I have a Pandas dataframe that looks like this:

Index, cat1, cat2, cat3
1,0.3,0.1,0.4
2,0.5,0.1,0.2
3,0.1,0.4,0.3

I want to create a fifth column "max_cat" with the column name of the category with the highest value like so:

Index, cat1, cat2, cat3, max_cat
1,0.3,0.1,0.4, cat3
2,0.5,0.1,0.2, cat1
3,0.1,0.4,0.3, cat2

How would I go about achieving that, preferably using Pandas?

Here is my code:

import pandas as pd
from io import StringIO
data = StringIO("""
Index, cat1, cat2, cat3
1,0.3,0.1,0.4
2,0.5,0.1,0.2
3,0.1,0.4,0.3
""")
df = pd.read_csv(data, skiprows=1, header=0, names=["cat1","cat2","cat3"])

Upvotes: 2

Views: 1461

Answers (1)

behzad.nouri
behzad.nouri

Reputation: 77951

if you are not worrying about ties, then you can use idxmax, with axis=1 on the cati columns:

>>> df['max_cat'] = df[['cat1', 'cat2', 'cat3']].idxmax(axis=1)
>>> df
   Index  cat1  cat2  cat3 max_cat
0      1   0.3   0.1   0.4    cat3
1      2   0.5   0.1   0.2    cat1
2      3   0.1   0.4   0.3    cat2

if you want to see the ties, then see this question.

Upvotes: 2

Related Questions