Boolean Logic Comparing Data From Two Pandas Dataframes

Question

I have the following dataframe df1:

          Date  Invoice          Name  Price  Coupon Location
0   2017-12-24   700349      John Doe  59.95    NONE    VAGG1
1   2017-12-24   700347     Joe Smith  59.95    GBMR       GG
2   2017-12-24   700345  Dave Johnson  35.00  CHANGE    VAGG1
3   2017-12-24   700342     Sue Davis  35.00  GADSLR    VAGG1
4   2017-12-23   700329   Betty Clark  84.95  GADSLR      GG2

and a second dataframe df2:

           Date  Invoice         Name  Price    Coupon    Location
0   2017-12-24    800349     John Doe  59.95   NONE      VAGG1
1   2017-12-24    800347    Joe Smith  59.95   GBMR      GG
2   2017-12-24    800345     John Doe  17.95   CHANGE    VAGG1
3   2017-12-24    800342     John Doe   9.95   GADSLR    VAGG1
4   2017-12-23    800329  Sue Simpson  34.95   GADSLR    GG2

I would like to create a third DataFrame, df3, using the following logic.

For each name in df1, check whether the same name appears in df2.
If it does, add the matching row from df2 to df3, provided that the price in that df2 row differs from the price associated with that name in df1.

So the output dataframe, df3, should appear as follows:

+------------+---------+----------+-------+--------+----------+
|    Date    | Invoice |   Name   | Price | Coupon | Location |
+------------+---------+----------+-------+--------+----------+
| 2017-12-24 |  800345 | John Doe | 17.95 | CHANGE | VAGG1    |
| 2017-12-24 |  800342 | John Doe |  9.95 | GADSLR | VAGG1    |
+------------+---------+----------+-------+--------+----------+

cs95 · Accepted Answer

Use merge followed by query:

df1.merge(df2[['Name', 'Price']], on='Name')\
   .query('Price_x != Price_y')\
   .drop(columns='Price_x')\
   .rename(columns={'Price_y' : 'Price'})

         Date  Invoice      Name Coupon Location  Price
1  2017-12-24   700349  John Doe   NONE    VAGG1  17.95
2  2017-12-24   700349  John Doe   NONE    VAGG1   9.95

Where df1 and df2 are your respective dataframes.
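
Note that the result above keeps df1's Date, Invoice, Coupon and Location and takes only Price from df2, whereas the expected output in the question shows df2's own rows (invoices 800345 and 800342). If the df2 rows are what you need, one possible variant is to merge in the other direction. This is a minimal sketch, not part of the original answer; the helper column name Price_df1 and the variable merged are made up for illustration, and it assumes each name appears only once in df1.

```python
import pandas as pd

# Sample data copied from the question.
df1 = pd.DataFrame({
    "Date": ["2017-12-24", "2017-12-24", "2017-12-24", "2017-12-24", "2017-12-23"],
    "Invoice": [700349, 700347, 700345, 700342, 700329],
    "Name": ["John Doe", "Joe Smith", "Dave Johnson", "Sue Davis", "Betty Clark"],
    "Price": [59.95, 59.95, 35.00, 35.00, 84.95],
    "Coupon": ["NONE", "GBMR", "CHANGE", "GADSLR", "GADSLR"],
    "Location": ["VAGG1", "GG", "VAGG1", "VAGG1", "GG2"],
})
df2 = pd.DataFrame({
    "Date": ["2017-12-24", "2017-12-24", "2017-12-24", "2017-12-24", "2017-12-23"],
    "Invoice": [800349, 800347, 800345, 800342, 800329],
    "Name": ["John Doe", "Joe Smith", "John Doe", "John Doe", "Sue Simpson"],
    "Price": [59.95, 59.95, 17.95, 9.95, 34.95],
    "Coupon": ["NONE", "GBMR", "CHANGE", "GADSLR", "GADSLR"],
    "Location": ["VAGG1", "GG", "VAGG1", "VAGG1", "GG2"],
})

# Attach df1's price to every df2 row that shares a Name (inner join by default).
# "Price_df1" is a hypothetical helper column used only for the comparison.
merged = df2.merge(
    df1[["Name", "Price"]].rename(columns={"Price": "Price_df1"}),
    on="Name",
)

# Keep the df2 rows whose price differs from df1's price for that name,
# then drop the helper column so only df2's original columns remain.
df3 = (
    merged[merged["Price"] != merged["Price_df1"]]
    .drop(columns="Price_df1")
    .reset_index(drop=True)
)

print(df3)
```

If a name can occur several times in df1 with different prices, the merge produces one row per pairing, so you would need to decide which df1 price to compare against before using this approach.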
