ShreyGrover

Reputation: 35

Finding the maximum of a row in a dataframe and returning its column name in pandas

I have a data set of football players. For each player, I need to find the maximum of Penalties and Volleys, then add columns at the end that hold the maximum value and whether it came from Penalties or Volleys. I tried the following code:

import pandas as pd
import numpy as np
df=pd.read_excel(r'C:\Users\shrey\Desktop\FullData.xlsx')
for j,i in df.iterrows():
   data=i[['Penalties','Volleys']]
   i['max']=np.max(data)
   i['max_attr']=i.idxmax()

But this gives me an error: "reduction operation 'argmax' not allowed for this dtype". How should I go about this?

Upvotes: 1

Views: 71

Answers (1)

jpp

Reputation: 164823

You don't need to iterate over rows here. Instead, you can use pd.DataFrame.max and pd.DataFrame.idxmax along axis 1 (across columns, giving one result per row) to perform vectorised calculations:

cols = ['Penalties', 'Volleys']

df['max'] = df[cols].max(axis=1)
df['max_attr'] = df[cols].idxmax(axis=1)

Here's a demo:

import pandas as pd

df = pd.DataFrame([[2, 3], [5, 1]], columns=['Penalties', 'Volleys'])

cols = ['Penalties', 'Volleys']

df['max'] = df[cols].max(axis=1)            # row-wise maximum
df['max_attr'] = df[cols].idxmax(axis=1)    # column name holding that maximum

print(df)

   Penalties  Volleys  max   max_attr
0          2        3    3    Volleys
1          5        1    5  Penalties
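
A slightly fuller sketch, assuming the real spreadsheet also contains non-numeric columns such as a player name. The `Name` column and all values below are made up for illustration. In the question, `i.idxmax()` is called on the whole row rather than on the two selected columns, which is probably what triggers the dtype error; selecting `cols` first keeps text columns out of the reduction. The third row also shows what happens on a tie:

```python
import pandas as pd

# Hypothetical data: 'Name' and all values are made up for illustration.
df = pd.DataFrame({
    'Name': ['Player A', 'Player B', 'Player C'],
    'Penalties': [2, 5, 4],
    'Volleys': [3, 1, 4],
})

cols = ['Penalties', 'Volleys']

# Restrict to the numeric columns so the text column is never compared.
df['max'] = df[cols].max(axis=1)

# On a tie (Player C), idxmax returns the first matching column in `cols`.
df['max_attr'] = df[cols].idxmax(axis=1)

print(df)
```

If ties should favour Volleys instead, reordering `cols` to `['Volleys', 'Penalties']` should be enough, since the first matching label wins.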

Upvotes: 1
