Checking column values in Python pandas

Question

How do I check whether the values in each row of a pandas DataFrame are the same across all columns, and store the result in a fourth column?

Original:

   red  blue  green
a    1     1      1
b    1     2      1
c    2     2      2

becomes:

   red  blue  green  match
a    1     1      1      1
b    1     2      1      0
c    2     2      2      1

Originally I only had 2 columns and it was possible to achieve something similar by doing this:

df['match']=df['blue']-df['red']

but this won't work with 3 columns.

Your help is greatly appreciated!

Zero · Accepted Answer

For a more general solution that works with any number of columns, compare the values within each row using the apply method.

Using set()

In [54]: df['match'] = df.apply(lambda x: len(set(x)) == 1, axis=1).astype(int)

In [55]: df
Out[55]:
   red  blue  green  match
a    1     1      1      1
b    1     2      1      0
c    2     2      2      1
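For anyone who wants to reproduce this outside an interactive session, here is a minimal self-contained sketch of the set() approach. The DataFrame is rebuilt from the table in the question; the variable name `row` is just my choice for illustration.

```python
import pandas as pd

# Rebuild the example table from the question.
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# A row matches when all of its values collapse into a one-element set.
# axis=1 passes each row (not each column) to the lambda.
df["match"] = df.apply(lambda row: len(set(row)) == 1, axis=1).astype(int)

print(df)
```

Because apply with axis=1 calls a Python function once per row, this is easy to read but can be slow on large frames.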

Alternatively, use pd.Series.nunique to count the number of unique values in each row.

In [56]: (df.apply(pd.Series.nunique, axis=1) == 1).astype(int)
Out[56]:
a    1
b    0
c    1
dtype: int32
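One thing to watch for: if df already contains the match column from the previous step, that column takes part in the comparison too. A hedged sketch of how this might be handled is below. It assumes the column names from the question and uses DataFrame.nunique(axis=1), which gives the same per-row count without going through apply.

```python
import pandas as pd

# Example table from the question, with the 'match' column already added.
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)
df["match"] = [1, 0, 1]

# Compare only the original columns. With 'match' included, row c would
# hold the values (2, 2, 2, 1) and no longer count as uniform.
cols = ["red", "blue", "green"]
match = (df[cols].nunique(axis=1) == 1).astype(int)

# For contrast: the same check over every column, 'match' included.
with_match_included = (df.nunique(axis=1) == 1).astype(int)

print(match)
print(with_match_included)
```

Selecting the columns explicitly also makes the line safe to re-run after the result column exists.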

Or, take the first column's values with df.iloc[:, 0] and compare them against every column of df using eq. A row matches when all of its comparisons are True.

In [57]: df.eq(df.iloc[:, 0], axis=0).all(axis=1).astype(int)
Out[57]:
a    1
b    0
c    1
dtype: int32
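To see what the eq approach does at each step, here is a small sketch that splits the one-liner into named intermediates. The names `first` and `same_as_first` are mine, and the DataFrame is rebuilt from the question.

```python
import pandas as pd

# Rebuild the example table from the question.
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# Values of the first column ('red'), indexed by row label.
first = df.iloc[:, 0]

# Compare every column against 'first'. axis=0 aligns the Series with the
# row index, so each row is compared with its own first value.
same_as_first = df.eq(first, axis=0)

# A row matches only if every column equals the first column in that row.
match = same_as_first.all(axis=1).astype(int)

print(same_as_first)
print(match)
```

This version stays vectorised, so it avoids the per-row Python call that apply makes.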
