Reputation: 1044
How do I check whether the values in each row of a pandas DataFrame are the same across all columns, and store the result in a fourth column?
Original:
red blue green
a 1 1 1
b 1 2 1
c 2 2 2
becomes:
red blue green match
a 1 1 1 1
b 1 2 1 0
c 2 2 2 1
Originally I only had 2 columns and it was possible to achieve something similar by doing this:
df['match']=df['blue']-df['red']
but this won't work with 3 columns.
Your help is greatly appreciated!
Upvotes: 2
Views: 90
Reputation: 76917
To make it more generic, compare the values in each row using the apply method.
Using set():
In [54]: df['match'] = df.apply(lambda x: len(set(x)) == 1, axis=1).astype(int)
In [55]: df
Out[55]:
red blue green match
a 1 1 1 1
b 1 2 1 0
c 2 2 2 1
Alternatively, use pd.Series.nunique to count the number of unique values in each row:
In [56]: (df.apply(pd.Series.nunique, axis=1) == 1).astype(int)
Out[56]:
a 1
b 0
c 1
dtype: int32
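As a possible variation on the nunique idea, DataFrame.nunique also accepts axis=1, so the per-row count can be computed without apply. The following is a minimal sketch that rebuilds the sample data from the question; it is not part of the original answer.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# Count distinct values per row; exactly one distinct value
# means every column in that row holds the same value.
df["match"] = df.nunique(axis=1).eq(1).astype(int)
print(df)
```

Note that nunique ignores NaN by default, so a row containing a value and a NaN would count as a match unless dropna=False is passed.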
Or, use df.iloc[:, 0] to take the first column's values and compare them against every column of df with eq:
In [57]: df.eq(df.iloc[:, 0], axis=0).all(axis=1).astype(int)
Out[57]:
a 1
b 0
c 1
dtype: int32
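One thing to watch when running these snippets in sequence: once df['match'] exists, expressions that operate on the whole of df will include the match column in the comparison. A minimal sketch of the eq approach that avoids this by fixing the list of columns up front (value_cols is a name introduced here for illustration):

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

# Fix the columns to compare before the result column is added,
# so re-running the assignment never compares against "match" itself.
value_cols = ["red", "blue", "green"]

# Compare every selected column with the first one, row by row,
# then require all comparisons in a row to be True.
first = df[value_cols[0]]
df["match"] = df[value_cols].eq(first, axis=0).all(axis=1).astype(int)

# Running the same line again gives the same result.
df["match"] = df[value_cols].eq(first, axis=0).all(axis=1).astype(int)
print(df)
```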
Upvotes: 2
Reputation: 21873
You can try this:
df["match"] = df.apply(lambda x: int(x.iloc[0]==x.iloc[1]==x.iloc[2]), axis=1)
where:
x.iloc[0]==x.iloc[1]==x.iloc[2]: tests the equality of the first three columns
axis=1: applies the function to each row
Alternatively, you can refer to the columns by name:
df["match"] = df.apply(lambda x: int(x["red"]==x["blue"]==x["green"]), axis=1)
This is more convenient if you have many columns and want to compare a subset of them without knowing their positions.
If you want to compare all the columns, use John Galt's solution.
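Comparing a named subset can also be done without writing out each column in a chained comparison. The sketch below assumes a hypothetical helper, rows_match, which is not from either answer; it applies the eq/all idea to whichever column names are passed in.

```python
import pandas as pd


def rows_match(frame, cols):
    """Return 1 for rows where all of `cols` hold the same value, else 0.

    Hypothetical helper for illustration; `cols` is a list of column names.
    """
    subset = frame[cols]
    # Compare each selected column with the first selected column.
    return subset.eq(subset.iloc[:, 0], axis=0).all(axis=1).astype(int)


# Sample data from the question
df = pd.DataFrame(
    {"red": [1, 1, 2], "blue": [1, 2, 2], "green": [1, 1, 2]},
    index=["a", "b", "c"],
)

all_three = rows_match(df, ["red", "blue", "green"])
red_green = rows_match(df, ["red", "green"])
print(all_three.tolist())  # [1, 0, 1]
print(red_green.tolist())  # [1, 1, 1]
```

Row b differs only in blue, so it matches when just red and green are compared.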
Upvotes: 1