enterML

Reputation: 2285

Getting column name where a condition matches in a row

I have a pandas dataframe which looks like this:

        A     B     C     D     E     F     G     H     I
1       0.0   1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
2       1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0
3       0.0   1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0

Now, for each row, I have to check which column contains 1 and then record this column name in a new column. The final dataframe would look like this:

        A     B     C     D     E     F     G     H     I     IsTrue
1       0.0   1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   B
2       1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   A
3       0.0   1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0   B

Is there a fast, Pythonic way to do this?

Upvotes: 2

Views: 190

Answers (2)

yatu

Reputation: 88226

Here's one way using DataFrame.dot:

df['isTrue'] = df.astype(bool).dot(df.columns)

    A    B    C    D    E    F    G    H    I    isTrue
1  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      B
2  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      A
3  0.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      B
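A note on why this works, as a rough sketch: multiplying a string by True keeps it and multiplying by False gives an empty string, so the dot product concatenates the names of the matching columns. The sample frame below is made up for illustration (it adds a row with two 1s and a row with none), and the comma separator is my own addition, not part of the original one-liner.

```python
import pandas as pd

# Hypothetical sample data: rows 3 and 4 are edge cases not in the question.
df = pd.DataFrame(
    [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],   # two matching columns
     [0.0, 0.0, 0.0]],  # no matching column
    columns=list("ABC"),
    index=[1, 2, 3, 4],
)

mask = df.astype(bool)

# Append a separator to each column name so multiple matches stay readable,
# then strip the trailing separator from the concatenated result.
labels = mask.dot(mask.columns + ",").str.rstrip(",")

print(labels.tolist())
```

So with exactly one 1 per row you get a single column name, with several you get all of them joined, and with none you get an empty string.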

For even better performance, you can use:

df['isTrue'] = df.columns[df.to_numpy().argmax(1)]
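One caveat with argmax, sketched under the assumption that some rows might contain no 1 at all: argmax returns position 0 for an all-zero row, so that row would be labelled with the first column. The sample data and the variable names (hits, has_hit, first_hit) below are made up for illustration; this is one possible guard, not the only one.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: the last row has no 1 in it.
df = pd.DataFrame(
    [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 0.0, 0.0]],
    columns=list("ABC"),
    index=[1, 2, 3],
)

hits = df.to_numpy() == 1          # boolean array: True where the value is 1
first_hit = hits.argmax(axis=1)    # position of the first True in each row
has_hit = hits.any(axis=1)         # False for rows without any 1

# Look up the column names, but leave None where the row had no match.
names = df.columns.to_numpy(dtype=object)
labels = np.where(has_hit, names[first_hit], None)

df["isTrue"] = labels
print(df["isTrue"].tolist())
```

If every row is guaranteed to contain exactly one 1, the guard is unnecessary and the one-liner above is enough.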

Upvotes: 3

rafaelc

Reputation: 59274

What you described is exactly what idxmax does. With axis=1 it returns, for each row, the label of the column holding the maximum value:

>>> df.idxmax(1)
1    B
2    A
3    B
dtype: object
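To get the result as a new column, as the question asks, something along these lines should do. This is a minimal sketch on a cut-down three-column version of the frame; the value_cols name is my own, and it is there so that the new column is not included if the line is run a second time.

```python
import pandas as pd

# Cut-down version of the frame from the question (columns A-C only).
df = pd.DataFrame(
    [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]],
    columns=list("ABC"),
    index=[1, 2, 3],
)

# Remember the numeric columns before adding the label column.
value_cols = list(df.columns)

# For each row, take the label of the column with the largest value.
df["IsTrue"] = df[value_cols].idxmax(axis=1)

print(df)
```

Note that idxmax picks the first maximum in each row, so a row of all zeros would be labelled with the first column.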

Upvotes: 0
