Reputation: 397

Pandas DataFrame: how can I check whether the values in two columns of a row are equal to those in the same columns of the next row?

Let's say I have a DataFrame like this:

Fruit  Color  Weight
apple   red    50
apple   red    75
apple  green   45
orange orange  80
orange orange  90
orange  red    90

I would like to add a column that is True when the Fruit and Color of row x are both equal to the Fruit and Color of row x+1, and False otherwise, like this:

Fruit  Color  Weight Validity
apple   red    50      True
apple   red    75      False
apple  green   45      False
orange orange  80      True
orange orange  90      False
orange  red    90      False

I tried the following, but it gives the wrong results:

g['Validity'] = (g[['Fruit', 'Color']] == g[['Fruit', 'Color']].shift()).any(axis=1)

Upvotes: 2

Answers (3)

IoaTzimas

Reputation: 10624

Similar to the other answers:

df['Validity']=(df[['Fruit', 'Color']]==pd.concat([df['Fruit'].shift(-1), df['Color'].shift(-1)], axis=1)).all(axis=1)

>>> print(df)
    Fruit   Color  Weight  Validity
0   apple     red      50      True
1   apple     red      75     False
2   apple   green      45     False
3  orange  orange      80      True
4  orange  orange      90     False
5  orange     red      90     False

Upvotes: 0

Nk03

Reputation: 14949

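Another alternative: join the two columns into one string per row and compare that with the next row's string.

# Concatenate Fruit and Color into a single key per row
subset_df = df[['Fruit','Color']].apply(''.join, axis=1)
# The comparison already yields booleans, so np.where(..., True, False) is not needed
df['Validity'] = subset_df == subset_df.shift(-1)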
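One caveat with joining the columns: without a separator, different pairs can produce the same string. The sketch below uses made-up values (not the question's data) to show the kind of false match that can occur, next to a column-by-column comparison that avoids it.

```python
import pandas as pd

# Hypothetical data chosen so that two different (Fruit, Color) pairs
# concatenate to the same string: "ab" + "c" and "a" + "bc" are both "abc".
df = pd.DataFrame({
    "Fruit": ["ab", "a", "x"],
    "Color": ["c", "bc", "y"],
})

# String-join approach: one concatenated key per row, compared with the next row
joined = df[["Fruit", "Color"]].apply("".join, axis=1)
by_join = joined == joined.shift(-1)

# Column-by-column approach: each column is compared separately, then combined
keys = df[["Fruit", "Color"]]
by_column = keys.eq(keys.shift(-1)).all(axis=1)

print(pd.DataFrame({"by_join": by_join, "by_column": by_column}))
```

Here row 0 is flagged by the join approach even though neither column matches row 1. With data like the fruit example this is unlikely to matter, but comparing the columns separately sidesteps the issue.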

Upvotes: 0

cs95

Reputation: 402353

You had the right idea with a shifted comparison, but shift() with no argument lines each row up with the previous one. Use shift(-1) so that each row is compared with the next one. Then use all(axis=1) rather than any(axis=1), so that both columns have to match in a row:

df['Validity'] = df[['Fruit', 'Color']].eq(df[['Fruit', 'Color']].shift(-1)).all(axis=1)

df
    Fruit   Color  Weight  Validity
0   apple     red      50      True
1   apple     red      75     False
2   apple   green      45     False
3  orange  orange      80      True
4  orange  orange      90     False
5  orange     red      90     False
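To see the difference from the attempt in the question, here is a self-contained sketch that rebuilds the sample data and runs both versions side by side. The variable names (keys, attempt, validity) are only for illustration.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Fruit": ["apple", "apple", "apple", "orange", "orange", "orange"],
    "Color": ["red", "red", "green", "orange", "orange", "red"],
    "Weight": [50, 75, 45, 80, 90, 90],
})

keys = df[["Fruit", "Color"]]

# Attempt from the question: shift() compares with the PREVIOUS row,
# and any(axis=1) is satisfied when just one of the two columns matches.
attempt = keys.eq(keys.shift()).any(axis=1)

# shift(-1) compares with the NEXT row, and all(axis=1) requires both
# columns to match. The last row has no next row, so it comes out False.
validity = keys.eq(keys.shift(-1)).all(axis=1)

df["Validity"] = validity
print(df)
print(attempt.tolist())
```

The attempt gives [False, True, True, False, True, True], because every row that shares at least its Fruit with the row above counts as a match. The shift(-1) plus all(axis=1) version gives the expected [True, False, False, True, False, False].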

Upvotes: 5
