Pietro Bellamio
Pietro Bellamio

Reputation: 3

how to create a column with the index of the biggest among other columns AND some condition

I have a dataset with some columns, I want to create another column, where values are the column name of the variable with the highest value BUT different from 1

For Example:

df = pd.DataFrame({'A': [1, 0.2, 0.1, 0],
                    'B': [0.2,1, 0, 0.5],
                    'C': [1, 0.4, 0.3, 1]},
                   index=['1', '2', '3', '4'])
df
index A B C
1 1.0 0.2 1.0
2 0.2 1.0 0.4
3 0.1 0.0 0.3
4 0.0 0.5 1.0

Should give an output like

index A B C NEWCOL
1 1.0 0.2 1.0 B
2 0.2 0.3 0.1 C
3 0.1 0.4 0.2 B
4 0.0 0.5 1.0 B
df2['newcol'] = df2.idxmax(axis=1) if df2.max(index=1) != 1 

but didn't work

Upvotes: 0

Views: 32

Answers (1)

Naveed
Naveed

Reputation: 11650

here is one way to do it

# filter out the data that is 1 and find the id of the max value using idxmax
df['newcol']=df[~df.isin([1])].idxmax(axis=1)
df
      A       B     C   newcol
1   1.0     0.2     1.0     B
2   0.2     1.0     0.4     C
3   0.1     0.0     0.3     C
4   0.0     0.5     1.0     B

PS: your input, starting and expected data don't match. The above is based on the input DF

Upvotes: 0

Related Questions