qts

Reputation: 1044

Checking column values in Python pandas

How do I check whether the values in each row of a pandas DataFrame are the same across all columns, and store the result in a fourth column?

Original:

    red  blue  green
a   1    1     1
b   1    2     1
c   2    2     2

becomes:

   red blue green match
a  1   1    1     1
b  1   2    1     0
c  2   2    2     1

Originally I only had 2 columns and it was possible to achieve something similar by doing this:

df['match']=df['blue']-df['red']

but this won't work with 3 columns.

Your help is greatly appreciated!

Upvotes: 2

Views: 90

Answers (2)

Zero

Reputation: 76917

To make it more generic, compare the values in each row using the apply method.

Using set():

In [54]: df['match'] = df.apply(lambda x: len(set(x)) == 1, axis=1).astype(int)

In [55]: df
Out[55]:
   red  blue  green  match
a    1     1      1      1
b    1     2      1      0
c    2     2      2      1

Alternatively, use pd.Series.nunique to count the number of unique values in each row.

In [56]: (df.apply(pd.Series.nunique, axis=1) == 1).astype(int)
Out[56]:
a    1
b    0
c    1
dtype: int32

Or, take the first column's values with df.iloc[:, 0] and compare them against every column of df with eq:

In [57]: df.eq(df.iloc[:, 0], axis=0).all(axis=1).astype(int)
Out[57]:
a    1
b    0
c    1
dtype: int32
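For reference, the three approaches above can be put side by side in one self-contained script. This is only a sketch: the DataFrame is rebuilt from the sample data in the question, the names by_set, by_nunique and by_eq are made up for illustration, and the second variant swaps in DataFrame.nunique(axis=1) as a shorthand for applying pd.Series.nunique row by row.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# 1) Row-wise set: a row matches when it holds exactly one distinct value.
by_set = df.apply(lambda row: len(set(row)) == 1, axis=1).astype(int)

# 2) Count distinct values per row (shorthand for apply(pd.Series.nunique, axis=1)).
by_nunique = (df.nunique(axis=1) == 1).astype(int)

# 3) Vectorized: compare every column against the first one, row by row.
by_eq = df.eq(df.iloc[:, 0], axis=0).all(axis=1).astype(int)

# Hypothetical names above; any of the three results can be assigned.
df["match"] = by_eq
print(df)
```

The eq-based variant avoids a Python-level function call per row, so it is probably the better choice on large frames. Note that nunique ignores NaN by default, so the variants may disagree on rows with missing values.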

Upvotes: 2

jrjc

Reputation: 21873

You can try this:

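df["match"] = df.apply(lambda x: int(x.iloc[0] == x.iloc[1] == x.iloc[2]), axis=1)

where:

  • x.iloc[0] == x.iloc[1] == x.iloc[2]: tests the first three columns for equality (x.iloc[i] is positional; plain x[i] on a row with string labels is deprecated in recent pandas)
  • axis=1: applies the function to each row, i.e. across the columns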

Alternatively, you can refer to the columns by name:

df["match"] = df.apply(lambda x: int(x["red"]==x["blue"]==x["green"]), axis=1)

This is more convenient if you have many columns and want to compare only a subset of them without knowing their positions.
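When the subset is longer than two or three columns, chaining == by hand gets unwieldy. A possible sketch, assuming the columns of interest are kept in a list (cols is a hypothetical name, and the data is again the sample from the question):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# Hypothetical subset: any list of column names, of any length.
cols = ["red", "blue"]

# Compare each selected column against the first selected column, row by row.
subset = df[cols]
df["match"] = subset.eq(subset.iloc[:, 0], axis=0).all(axis=1).astype(int)
print(df)
```

Only the list needs to change to compare a different set of columns.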

If you want to compare all the columns, use John Galt's solution

Upvotes: 1
