user9639519

Most efficient way to return Column name in a pandas df

I have a pandas df with 4 columns. In every row there's one value that's of importance, and I want to return the name of the column where that value appears. So for the df below, I want to return the column name wherever the value 2 occurs.

import pandas as pd

d = {
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
}

df = pd.DataFrame(data=d)

Output:

   A  B  C  D
0  2  0  0  0
1  0  0  2  0
2  0  2  0  0
3  2  0  0  0

So the expected result would be A, C, B, A.

I'm doing this via

m = (df == 2).idxmax(axis=1)[0]

and then changing the row index for each row, but this isn't very efficient.

I'm also hoping to get the output as a pandas Series.

Upvotes: 1

Views: 117

Answers (1)

cs95

Reputation: 402922

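Use DataFrame.dot. Every value other than 2 in your frame is 0, so casting to bool marks exactly the cells of interest, and the dot product with the column names picks out the matching label for each row: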

df.astype(bool).dot(df.columns).str.cat(sep=',')

Or,

','.join(df.astype(bool).dot(df.columns))

'A,C,B,A'

Or, as a list:

df.astype(bool).dot(df.columns).tolist()
['A', 'C', 'B', 'A']

...or a Series:

df.astype(bool).dot(df.columns)

0    A
1    C
2    B
3    A
dtype: object
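
If the frame could hold other non-zero values, astype(bool) would flag those as well. A minimal sketch that compares against the value explicitly and calls idxmax on the whole frame rather than row by row (the names target, mask and cols are only illustrative, and returning NaN for rows with no match is an assumption about what you'd want):

```python
import pandas as pd

df = pd.DataFrame({
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
})

target = 2                 # the value of interest (assumed from the question)
mask = df == target        # boolean frame, True only where the value is 2

# idxmax(axis=1) returns, per row, the label of the first True column
cols = mask.idxmax(axis=1)

# idxmax falls back to the first column when a row has no True at all,
# so blank those rows out (they become NaN) instead of reporting 'A'
cols = cols.where(mask.any(axis=1))

print(cols.tolist())
```

For the sample data this yields the same A, C, B, A Series as above.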

Upvotes: 2
