hrishi

Reputation: 433

python DataFrame comparison using == operator

I have two pandas DataFrames with the same structure and the same number of rows. When I compare them with the '==' operator, the result is wrong.

df1:

      0     61561899
      1     56598947
      2     52231204
      3     10069030
      4     19900179
      5     52892001
      6     50015534
      7     10071207
      8     55455545
      9     10075649
      10    52050196
 Name: spn, dtype: object

df2:

  0     61561899
  1     56598947
  2     52231204
  3     10069030
  4     19900179
  5     52892001
  6     50015534
  7     10071207
  8     55455545
  9     10075649
  10    52050196
  Name: spn, dtype: object

print(df1 == df2)

The above statement gives the following output:

  0     False
  1     False
  2     False
  3     False
  4     False
  5     False
  6     False
  7     False
  8     False
  9     False
  10    False
  Name: spn, dtype: bool

I don't know what I am missing. I expected all True.

Upvotes: 0

Views: 1312

Answers (3)

Donald S

Reputation: 1753

I would like to see your code, because for me the comparison gives True values. Here is an example using your data.

import pandas as pd
df1 = pd.DataFrame({'A': [61561899,
       56598947,
       52231204,
       10069030,
       19900179,
       52892001,
       50015534,
       10071207,
       55455545,
       10075649,
       52050196]})

df2 = pd.DataFrame({'A': [61561899,
       56598947,
       52231204,
       10069030,
       19900179,
       52892001,
       50015534,
       10071207,
       55455545,
       10075649,
       52050196]})

print(df1 == df2)

Out[1]: 
       A
0   True
1   True
2   True
3   True
4   True
5   True
6   True
7   True
8   True
9   True
10  True

df1.dtypes

Out[2]:
A    int64
dtype: object
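
One difference from the question is worth checking: the Series in the question have dtype object, while the data here is int64. An object column can hold strings on one side and integers on the other, and both print identically. The sketch below shows how to inspect that. The mixed str/int data is an assumption made for illustration; the question does not confirm it.

```python
import pandas as pd

# Assumption for illustration: one Series holds strings, the other integers.
s1 = pd.Series(["61561899", "56598947"], name="spn", dtype=object)
s2 = pd.Series([61561899, 56598947], name="spn", dtype=object)

# Both report dtype object, so the dtype alone does not reveal the difference.
print(s1.dtype, s2.dtype)

# Look at the Python type of each element to see what is actually stored.
types1 = s1.map(type)
types2 = s2.map(type)
print(types1.tolist())
print(types2.tolist())

# A string never equals an integer, so every row compares as False.
result = s1 == s2
print(result.tolist())
```

If the element types differ between the two frames, that would explain an all-False result even though the printed values look the same.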

Upvotes: 0

shweta kapgate

Reputation: 37

You can also compare two DataFrames using equals. See http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.equals.html

import pandas as pd
df1 = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
df2 = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
df1.equals(df2)
Out[70]: True

It returns a single True if both DataFrames have the same shape and the same elements, and False otherwise.
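
Note that equals is stricter than == in one respect and more lenient in another. As a rough sketch (the frames ints, floats and with_nan are made up for this illustration): equals also requires matching column dtypes, and it treats NaN values in the same position as equal.

```python
import numpy as np
import pandas as pd

ints = pd.DataFrame({"A": [1, 2, 3]})
floats = pd.DataFrame({"A": [1.0, 2.0, 3.0]})

# Element-wise == compares values only, so 1 == 1.0 holds in every row.
elementwise_all = bool((ints == floats).all().all())

# equals additionally requires the column dtypes to match (int64 vs float64 here).
same = ints.equals(floats)

# NaN never equals NaN element-wise, but equals treats NaNs in the same
# position as equal.
with_nan = pd.DataFrame({"A": [1.0, np.nan]})
nan_elementwise = bool((with_nan == with_nan).all().all())
nan_equals = with_nan.equals(with_nan.copy())

print(elementwise_all, same, nan_elementwise, nan_equals)
```

So a False from equals can mean the dtypes differ even when every value matches.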

You can also use isin, which returns a DataFrame of booleans. For example:

df1 = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
df2 = pd.DataFrame({'A': [1, 2, 3], 'B': ['a', 'b', 'f']})
df1.isin(df2)
Out[68]: 
      A     B
0  True  True
1  True  True
2  True  True

Visit http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.isin.html
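
One caveat with isin: when the argument is a DataFrame, the index and column labels have to match as well as the values. A small sketch, where the frame named shifted is a made-up example with different row labels:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": ["a", "b", "f"]})

# Same values, but the row labels are 10, 11, 12 instead of 0, 1, 2.
shifted = pd.DataFrame({"A": [1, 2, 3], "B": ["a", "b", "f"]},
                       index=[10, 11, 12])

# With a DataFrame argument, isin aligns on index and column labels,
# so no row of df1 finds a partner in shifted.
by_label = df1.isin(shifted)
print(by_label)

# A dict of column -> allowed values ignores the row labels.
by_value = df1.isin({"A": [1, 2, 3], "B": ["a", "b", "f"]})
print(by_value)
```

If the two frames in the question have different indexes, isin with a DataFrame argument would report False even for identical values.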

Upvotes: 1

jezrael

Reputation: 863166

Try casting both columns to str and then comparing:

df1.spn.astype(str) == df2.spn.astype(str)

Or maybe you only need to compare the columns:

df1.spn == df2.spn
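
To illustrate why the cast to str can help, here is a minimal sketch. It assumes, without confirmation from the question, that one spn column holds strings and the other holds integers, both stored as dtype object. The pd.to_numeric variant is an addition of this sketch, not part of the suggestion above.

```python
import pandas as pd

# Assumption for illustration: strings on one side, integers on the other.
df1 = pd.DataFrame({"spn": pd.Series(["61561899", "56598947"], dtype=object)})
df2 = pd.DataFrame({"spn": pd.Series([61561899, 56598947], dtype=object)})

# Raw comparison: "61561899" is not equal to 61561899.
raw = df1.spn == df2.spn

# Casting both sides to str puts them on a common type.
as_str = df1.spn.astype(str) == df2.spn.astype(str)

# Alternative: convert both sides to numbers instead.
# pd.to_numeric raises by default if a value cannot be parsed.
as_num = pd.to_numeric(df1.spn) == pd.to_numeric(df2.spn)

print(raw.tolist(), as_str.tolist(), as_num.tolist())
```

The string cast compares text exactly, so values such as "1" and "1.0", or strings with stray whitespace, would still differ; the numeric conversion is the safer choice when every value is a number.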

Upvotes: 1
