Bounty Collector

Reputation: 635

How to compare two data frames with same columns but different number of rows?

df1=
  A   B  C  D
  a1  b1 c1 1
  a2  b2 c2 2
  a3  b3 c3 4

df2=
  A   B  C  D
  a1  b1 c1 2
  a2  b2 c2 1

I want to compare the values of column 'D' in the two dataframes. If both dataframes had the same number of rows, I would just do this:

newDF = df1['D']-df2['D']

However, sometimes the number of rows differs. I want a result dataframe like this:

resultDF=
  A   B  C  D_df1 D_df2  Diff
  a1  b1 c1  1     2       -1
  a2  b2 c2  2     1        1

EDIT: If the values of A, B, C in the 1st row of df1 and df2 are the same, then and only then compare the 1st row of column D from each dataframe. Repeat this for every row.
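For reference, here is a minimal sketch of what the plain subtraction does with the sample data above, assuming both frames carry the default integer index. pandas aligns the two columns on index labels, not on A, B, C, so the extra row in df1 comes out as NaN instead of being dropped.

```python
import pandas as pd

# Sample frames from the question (default RangeIndex assumed).
df1 = pd.DataFrame({
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2", "c3"],
    "D": [1, 2, 4],
})
df2 = pd.DataFrame({
    "A": ["a1", "a2"],
    "B": ["b1", "b2"],
    "C": ["c1", "c2"],
    "D": [2, 1],
})

# Subtraction aligns on the index: rows 0 and 1 are matched by position,
# row 2 exists only in df1 and therefore yields NaN.
diff = df1["D"] - df2["D"]
print(diff)
```

Because the match is by index only, this would also give misleading results if the rows of df2 were in a different order than those of df1.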

Upvotes: 3

Views: 10719

Answers (1)

Andy L.

Reputation: 25239

Use merge on the key columns A, B, C, then df.eval to compute the difference:

df1.merge(df2, on=['A','B','C'], suffixes=['_df1','_df2']).eval('Diff=D_df1 - D_df2')

Out[314]:
    A   B   C  D_df1  D_df2  Diff
0  a1  b1  c1      1      2    -1
1  a2  b2  c2      2      1     1
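A self-contained version of the same idea, as a sketch built on the sample data from the question. The default merge is an inner join, so the a3 row, which has no partner in df2, is dropped. The `how="outer"` variant at the end is an addition not in the answer above; it keeps unmatched rows with NaN in the missing column.

```python
import pandas as pd

# Sample frames from the question.
df1 = pd.DataFrame({
    "A": ["a1", "a2", "a3"],
    "B": ["b1", "b2", "b3"],
    "C": ["c1", "c2", "c3"],
    "D": [1, 2, 4],
})
df2 = pd.DataFrame({
    "A": ["a1", "a2"],
    "B": ["b1", "b2"],
    "C": ["c1", "c2"],
    "D": [2, 1],
})

keys = ["A", "B", "C"]

# Inner join (the default): only rows whose A, B, C match in both frames.
result = df1.merge(df2, on=keys, suffixes=("_df1", "_df2"))
result["Diff"] = result["D_df1"] - result["D_df2"]
print(result)

# Outer join: unmatched rows are kept, with NaN where a value is missing.
outer = df1.merge(df2, on=keys, how="outer", suffixes=("_df1", "_df2"))
outer["Diff"] = outer["D_df1"] - outer["D_df2"]
print(outer)
```

Assigning the Diff column directly is equivalent to the eval call here; eval just lets the whole thing stay in one chained expression.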

Upvotes: 3
