codequestionask

Reputation: 31

How to label multiple columns effectively using pandas

I have a DataFrame whose columns look like this:

    a b c d e
    1 0 0 0 0
    0 2 0 0 0
    3 0 0 0 0
    0 0 0 1 0
    0 0 1 0 0
    0 0 0 0 1

For this DataFrame I want to create a column called label:

    a b c d e label
    1 0 0 0 0  cola
    0 2 0 0 0  colb
    3 0 0 0 0  cola
    0 0 0 1 0  cold
    0 0 1 0 0  colc
    0 0 0 0 1  cole

The label is the name of the column that holds the non-zero value in that row, prefixed with "col".

My previous attempt was df['label'] = df['a'].apply(lambda x: 1 if x!=0), but it doesn't work. Is there any way to get the expected result?

Upvotes: 0

Views: 310

Answers (1)

Chris

Reputation: 16147

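Try idxmax on axis 1. For each row it returns the name of the column that holds the row's maximum value (the first such column if there is a tie).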

import pandas as pd
df = pd.DataFrame({'a': [1, 0, 3, 0, 0, 0],
 'b': [0, 2, 0, 0, 0, 0],
 'c': [0, 0, 0, 0, 1, 0],
 'd': [0, 0, 0, 1, 0, 0],
 'e': [0, 0, 0, 0, 0, 1]})

# idxmax(axis=1) gives the column name of each row's maximum; prefix it with 'col'
df['label'] = 'col' + df.idxmax(axis=1)

Output

   a  b  c  d  e label
0  1  0  0  0  0  cola
1  0  2  0  0  0  colb
2  3  0  0  0  0  cola
3  0  0  0  1  0  cold
4  0  0  1  0  0  colc
5  0  0  0  0  1  cole
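
One caveat: idxmax picks the largest value, which is not always the non-zero one. A row whose only non-zero entry is negative, or a row of all zeros, would get a misleading label. The following is a minimal sketch of one way to handle that, assuming "the label is the first non-zero column" is the intended rule. The sample data and the name value_cols are made up for illustration and are not from the question.

```python
import pandas as pd

# Illustrative data: one negative entry and one all-zero row
df = pd.DataFrame({'a': [1, 0, -3, 0],
                   'b': [0, 2, 0, 0],
                   'c': [0, 0, 0, 0]})

value_cols = ['a', 'b', 'c']      # hypothetical: the columns to inspect
nonzero = df[value_cols].ne(0)    # True where a cell is non-zero

# On a boolean frame, idxmax returns the first column that is True in each row
label = 'col' + nonzero.idxmax(axis=1)

# Rows with no non-zero value at all get a missing label instead of 'cola'
df['label'] = label.where(nonzero.any(axis=1))

print(df)
```

Restricting the lookup to value_cols also keeps the snippet safe to re-run after the label column has been added.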

Upvotes: 2
