Reputation: 1190
I have two dataframes with the same structure; we can use these as an example.
import pandas as pd
import numpy as np
data = {'name': ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon'],
'episodes': [42, 24, 31, 29, 37, 40],
'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
data1 = {'name': ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon'],
'episodes': [12, 32, 31, 32, 37, 40],
'gender': ['male', 'female', 'female', 'female', 'male', 'male']}
df1 = pd.DataFrame(data1, columns = ['name','episodes', 'gender'])
df = pd.DataFrame(data, columns = ['name','episodes', 'gender'])
for names in df['name']:
    if (df[df['name'].str.contains(f'{names}')]['episodes']).any() == (df1[df1['name'].str.contains(f'{names}')]['episodes']).any():
        print('True')
    else:
        print('False')
The loop is meant to check whether the number of episodes differs between the two dataframes, and it should print False for the names whose episode counts differ. But I am getting True every time:
True
True
True
True
True
True
Why is it not printing false?
Upvotes: 1
Views: 70
Reputation: 323266
We can just try a merge on name and compare the two episodes columns:
df.merge(df1,on='name',how='left').eval('episodes_x==episodes_y')
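One caveat with merging on name alone: Sheldon and Penny each appear twice in both frames, so every occurrence on the left pairs with every occurrence on the right and the result has more rows than the input. A minimal sketch of one way around this, assuming the n-th occurrence of a name in df should be matched with the n-th occurrence in df1 (the helper column occ is a made-up name for illustration):

```python
import pandas as pd

names = ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon']
df = pd.DataFrame({'name': names, 'episodes': [42, 24, 31, 29, 37, 40]})
df1 = pd.DataFrame({'name': names, 'episodes': [12, 32, 31, 32, 37, 40]})

# Plain merge on name: duplicated names pair up with every match.
merged = df.merge(df1, on='name', how='left')
print(len(merged))  # 10 rows instead of 6

# Assumption: match the n-th occurrence of each name on both sides.
# 'occ' is a hypothetical helper column counting occurrences per name.
left = df.assign(occ=df.groupby('name').cumcount())
right = df1.assign(occ=df1.groupby('name').cumcount())
paired = left.merge(right, on=['name', 'occ'], how='left')

same = paired['episodes_x'] == paired['episodes_y']
print(same.tolist())  # [False, False, True, False, True, True]
```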
Upvotes: 1
Reputation: 120419
Use set_index, then compare the episodes columns:
>>> df.set_index('name')['episodes'] == df1.set_index('name')['episodes']
name
Sheldon False
Penny False
Amy True
Penny False
Raj True
Sheldon True
Name: episodes, dtype: bool
Upvotes: 3
Reputation: 43
I think you meant to put data1 into df1? Right now you have created df1 and df both from data alone. Also, you don't really need to go through it row by row. (df == df1)['episodes'] should give you what you want.
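As for why the loop prints True every time, here is a minimal sketch using the data from the question: .any() collapses each selection to a single boolean that is True whenever at least one value is non-zero, so both sides of the == are True regardless of the actual episode counts.

```python
import pandas as pd

names = ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon']
df = pd.DataFrame({'name': names, 'episodes': [42, 24, 31, 29, 37, 40]})
df1 = pd.DataFrame({'name': names, 'episodes': [12, 32, 31, 32, 37, 40]})

# What the loop compares for one name: two Series, each reduced by .any().
left = df[df['name'].str.contains('Sheldon')]['episodes']    # 42, 40
right = df1[df1['name'].str.contains('Sheldon')]['episodes']  # 12, 40

# Non-zero numbers are truthy, so both reductions are True.
left_any = left.any()
right_any = right.any()
print(left_any, right_any)    # True True
print(left_any == right_any)  # True, although 42 != 12

# Element-wise comparison keeps one result per row instead.
elementwise = (df == df1)['episodes']
print(elementwise.tolist())   # [False, False, True, False, True, True]
```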
Upvotes: 0
Reputation: 1624
You can use the .eq() method:
print(df.episodes.eq(df1.episodes))
0 False
1 False
2 True
3 False
4 True
5 True
Name: episodes, dtype: bool
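If the goal is to see which rows differ rather than just a column of booleans, the same idea can be turned into a filter. This is only a sketch and assumes both frames have the same row order and index, as in the question:

```python
import pandas as pd

names = ['Sheldon', 'Penny', 'Amy', 'Penny', 'Raj', 'Sheldon']
df = pd.DataFrame({'name': names, 'episodes': [42, 24, 31, 29, 37, 40]})
df1 = pd.DataFrame({'name': names, 'episodes': [12, 32, 31, 32, 37, 40]})

# .ne() is the element-wise "not equal" counterpart of .eq().
differs = df.episodes.ne(df1.episodes)

# Keep only the rows of df whose episode count differs from df1.
changed = df[differs]
print(changed['name'].tolist())  # ['Sheldon', 'Penny', 'Penny']
```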
Upvotes: 2