Reputation:
I have a pandas df that contains 4 columns. In every row there's one value that's of importance, and I want to return the name of the column where that value appears. So for the df below, I want to return the column name wherever the value 2 appears.
import pandas as pd

d = {
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
}
df = pd.DataFrame(data=d)
Output:
   A  B  C  D
0  2  0  0  0
1  0  0  2  0
2  0  2  0  0
3  2  0  0  0
So it would be A,C,B,A
I'm doing this via
m = (df == 2).idxmax(axis=1)[0]
And then changing the row index, one row at a time. But this isn't very efficient.
I'm also hoping to produce the output as a pandas Series.
Upvotes: 1
Views: 117
Reputation: 402922
Use DataFrame.dot:
df.astype(bool).dot(df.columns).str.cat(sep=',')
Or,
','.join(df.astype(bool).dot(df.columns))
'A,C,B,A'
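For anyone wondering why this works, here is a rough plain-Python sketch of what the dot product ends up computing per row. It relies on True * 'A' being 'A' and False * 'A' being '', so multiplying the boolean mask by the column names and summing leaves only the matching names. The names rows and row_label are made up for illustration and are not part of pandas.

```python
# Plain-Python sketch of what the dot product computes for each row.
columns = ['A', 'B', 'C', 'D']
rows = [
    [2, 0, 0, 0],
    [0, 0, 2, 0],
    [0, 2, 0, 0],
    [2, 0, 0, 0],
]

def row_label(row):
    # bool(v) is True/False; True * 'A' == 'A' and False * 'A' == ''.
    # Joining the pieces mirrors the string "sum" that dot performs.
    return ''.join(bool(v) * name for v, name in zip(row, columns))

labels = [row_label(row) for row in rows]
print(labels)  # ['A', 'C', 'B', 'A']
```

Two things follow from this: astype(bool) flags any non-zero value, not just 2, and a row with more than one non-zero value gets its column names concatenated (for example 'AB').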
Or, as a list:
df.astype(bool).dot(df.columns).tolist()
['A', 'C', 'B', 'A']
...or a Series:
df.astype(bool).dot(df.columns)
0    A
1    C
2    B
3    A
dtype: object
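If you need to match the value 2 exactly rather than any non-zero value, a possible variant is to apply idxmax from the question to the whole frame instead of one row at a time. This is only a sketch, assuming each row has at most one 2. The where step is my addition, so that rows without a 2 come back as missing instead of silently returning the first column.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [2, 0, 0, 2],
    'B': [0, 0, 2, 0],
    'C': [0, 2, 0, 0],
    'D': [0, 0, 0, 0],
})

mask = df == 2                           # True only where the value is exactly 2
labels = mask.idxmax(axis=1)             # first column holding True in each row
labels = labels.where(mask.any(axis=1))  # missing value for rows that contain no 2
print(labels.tolist())  # ['A', 'C', 'B', 'A']
```

The result is already a Series indexed like df, so no row-by-row indexing is needed.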
Upvotes: 2