Ken
Ken

Reputation: 23

MySQL - Getting differences between two sets of data on two table using join?

Assuming that I have multiple sets of reports from multiple individuals. How do I identify the changes between these datas and decide which to merge to a specific database.

Scenario where applicable

Data 1

Date        Sales   Revenue
2016-01-01  27      30
2016-01-03  12      10
2016-01-04  48      50

Data 2

Date        Sales   Revenue
2016-01-01  27      10
2016-01-02  31      40
2016-01-04  48      50

Desired Outcome

Date        Sales   T1 Revenue  T2 Revenue
2016-01-01  27      30          10
2016-01-02  31      NULL        40
2016-01-03  12      10          NULL
2016-01-04  48      50          50

I have tried various method including a combination of UNION and JOIN, nothing seem to work for me right now.

The closest I got right now is the following.

SELECT d1.date,
       d1.sales,
       d1.revenue AS T1,
       d2.revenue AS T2

FROM dataset1 d1

RIGHT JOIN dataset2 d2 ON d1.date = d2.date

WHERE d1.revenue <> d2.revenue
OR (d1.revenue IS NOT NULL AND d2.revenue IS NULL)
OR (d1.revenue IS NULL AND d2.revenue IS NOT NULL)

The jump between left join / right join will only work depending on which side has the missing data.

Searched the site but haven't found a solution working for me =/

Upvotes: 1

Views: 2018

Answers (3)

Vamsi Prabhala
Vamsi Prabhala

Reputation: 49260

You should use a full join.

SELECT coalesce(d1.date,d2.date) dt,
       coalesce(d1.sales,d2.sales) sales,
       d1.revenue AS T1Revenue,
       d2.revenue AS T2Revenue
FROM dataset1 d1
FULL JOIN dataset2 d2 ON d1.date = d2.date

Use coalesce to get the non-null value for a column when it is not present in either of the given tables.

As MySQL doesn't support full join, this can be done with a combination of left and right joins with a union combining the results.

SELECT d1.date dt,
       d1.sales sales,
       d1.revenue AS T1Revenue,
       d2.revenue AS T2Revenue
FROM dataset1 d1
LEFT JOIN dataset2 d2 ON d1.date = d2.date
UNION
SELECT d2.date dt,
       d2.sales sales,
       d1.revenue AS T1Revenue,
       d2.revenue AS T2Revenue
FROM dataset1 d1
RIGHT JOIN dataset2 d2 ON d1.date = d2.date
ORDER BY 1

Upvotes: 0

Strawberry
Strawberry

Reputation: 33935

SELECT x.*
     , d1.revenue t1_revenue
     , d2.revenue t2_revenue
  FROM (SELECT date, sales FROM data1
         UNION
        SELECT date, sales FROM data2
       ) x
  LEFT
  JOIN data1 d1
    ON d1.date = x.date
  LEFT
  JOIN data2 d2
    ON d2.date = x.date
 ORDER
    BY date;

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269513

One method is brute force:

select 'd1' as which, d1.*
from data1 d1
where not exists (select 1
                  from data2 d2
                  where d1.date = d2.date and d1.revenue <=> d2.revenue
                 )
union all
select 'd2' as which, d2.*
from data1 d2
where not exists (select 1
                  from data1 d1
                  where d1.date = d2.date and d1.revenue <=> d2.revenue
                 );

Your sample query only compares revenue, but you can use the same logic to compare sales. Note that <=> is the NULL-safe comparison operator.

Upvotes: 0

Related Questions