Reputation: 397
Let's say I have a dataframe like this:
Fruit Color Weight
apple red 50
apple red 75
apple green 45
orange orange 80
orange orange 90
orange red 90
I would like to add a column that is True when the Fruit and Color of row x are equal to the Fruit and Color of row x+1, and False otherwise, like this:
Fruit Color Weight Validity
apple red 50 True
apple red 75 False
apple green 45 False
orange orange 80 True
orange orange 90 False
orange red 90 False
I tried the following, but I am getting wrong results:
g['Validity'] = (g[['Fruit', 'Color']] == g[['Fruit', 'Color']].shift()).any(axis=1)
Upvotes: 2
Views: 2440
Reputation: 10624
Similar to the other answers:
df['Validity']=(df[['Fruit', 'Color']]==pd.concat([df['Fruit'].shift(-1), df['Color'].shift(-1)], axis=1)).all(axis=1)
>>> print(df)
Fruit Color Weight Validity
0 apple red 50 True
1 apple red 75 False
2 apple green 45 False
3 orange orange 80 True
4 orange orange 90 False
5 orange red 90 False
Upvotes: 0
Reputation: 14949
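Another alternative:
# Join the key columns into one string per row; the '|' separator keeps ('ab', 'c') distinct from ('a', 'bc')
subset_df = df[['Fruit', 'Color']].apply('|'.join, axis=1)
# The comparison already yields booleans, so np.where is not needed; the last row compares against NaN and is False
df['Validity'] = subset_df == subset_df.shift(-1)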
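One caveat with joining columns into a single string is that the separator matters: without one, two different (Fruit, Color) pairs can collapse into the same key. The sketch below illustrates this with made-up data (the values 'ab', 'c', 'a', 'bc' and the '|' separator are assumptions chosen for the demonstration, not part of the question):

```python
import pandas as pd

# Hypothetical data: two different (Fruit, Color) pairs that
# concatenate to the same string when no separator is used.
df = pd.DataFrame({"Fruit": ["ab", "a"], "Color": ["c", "bc"]})

# Join the key columns row by row, without and with a separator.
no_sep = df[["Fruit", "Color"]].apply("".join, axis=1)
with_sep = df[["Fruit", "Color"]].apply("|".join, axis=1)

# Compare each row's key with the next row's key.
naive = (no_sep == no_sep.shift(-1)).tolist()
safe = (with_sep == with_sep.shift(-1)).tolist()

print(naive)  # first row is reported as a match although the pairs differ
print(safe)   # no row is reported as a match
```

The separator only helps if it cannot occur in the data itself; comparing the columns directly avoids the problem entirely.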
Upvotes: 0
Reputation: 402353
You had the right idea with the shifted comparison, but you need shift(-1) so that each row is compared with the next one rather than the previous one. Then use all instead of any, so that ALL columns must be equal in a row:
df['Validity'] = df[['Fruit', 'Color']].eq(df[['Fruit', 'Color']].shift(-1)).all(axis=1)
df
Fruit Color Weight Validity
0 apple red 50 True
1 apple red 75 False
2 apple green 45 False
3 orange orange 80 True
4 orange orange 90 False
5 orange red 90 False
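For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample data from the question and puts the original attempt next to the corrected version. The helper names keys and attempt are only illustrative:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Fruit": ["apple", "apple", "apple", "orange", "orange", "orange"],
    "Color": ["red", "red", "green", "orange", "orange", "red"],
    "Weight": [50, 75, 45, 80, 90, 90],
})

keys = df[["Fruit", "Color"]]

# Original attempt: shift() compares with the PREVIOUS row,
# and any() is satisfied when only one of the two columns matches.
attempt = (keys == keys.shift()).any(axis=1)

# Fix: shift(-1) compares with the NEXT row,
# and all() requires both columns to match.
df["Validity"] = keys.eq(keys.shift(-1)).all(axis=1)

print(attempt.tolist())
print(df)
```

The last row has no next row, so it is compared against NaN and comes out as False.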
Upvotes: 5