Reputation: 23
Assuming that I have multiple sets of reports from multiple individuals. How do I identify the changes between these datas and decide which to merge to a specific database.
Scenario where applicable
Data 1
Date Sales Revenue
2016-01-01 27 30
2016-01-03 12 10
2016-01-04 48 50
Data 2
Date Sales Revenue
2016-01-01 27 10
2016-01-02 31 40
2016-01-04 48 50
Desired Outcome
Date Sales T1 Revenue T2 Revenue
2016-01-01 27 30 10
2016-01-02 31 NULL 40
2016-01-03 12 10 NULL
2016-01-04 48 50 50
I have tried various method including a combination of UNION and JOIN, nothing seem to work for me right now.
The closest I got right now is the following.
SELECT d1.date,
d1.sales,
d1.revenue AS T1,
d2.revenue AS T2
FROM dataset1 d1
RIGHT JOIN dataset2 d2 ON d1.date = d2.date
WHERE d1.revenue <> d2.revenue
OR (d1.revenue IS NOT NULL AND d2.revenue IS NULL)
OR (d1.revenue IS NULL AND d2.revenue IS NOT NULL)
The jump between left join / right join will only work depending on which side has the missing data.
Searched the site but haven't found a solution working for me =/
Upvotes: 1
Views: 2018
Reputation: 49260
You should use a full join
.
SELECT coalesce(d1.date,d2.date) dt,
coalesce(d1.sales,d2.sales) sales,
d1.revenue AS T1Revenue,
d2.revenue AS T2Revenue
FROM dataset1 d1
FULL JOIN dataset2 d2 ON d1.date = d2.date
Use coalesce
to get the non-null value for a column when it is not present in either of the given tables.
As MySQL doesn't support full join
, this can be done with a combination of left
and right
joins with a union
combining the results.
SELECT d1.date dt,
d1.sales sales,
d1.revenue AS T1Revenue,
d2.revenue AS T2Revenue
FROM dataset1 d1
LEFT JOIN dataset2 d2 ON d1.date = d2.date
UNION
SELECT d2.date dt,
d2.sales sales,
d1.revenue AS T1Revenue,
d2.revenue AS T2Revenue
FROM dataset1 d1
RIGHT JOIN dataset2 d2 ON d1.date = d2.date
ORDER BY 1
Upvotes: 0
Reputation: 33935
SELECT x.*
, d1.revenue t1_revenue
, d2.revenue t2_revenue
FROM (SELECT date, sales FROM data1
UNION
SELECT date, sales FROM data2
) x
LEFT
JOIN data1 d1
ON d1.date = x.date
LEFT
JOIN data2 d2
ON d2.date = x.date
ORDER
BY date;
Upvotes: 1
Reputation: 1269513
One method is brute force:
select 'd1' as which, d1.*
from data1 d1
where not exists (select 1
from data2 d2
where d1.date = d2.date and d1.revenue <=> d2.revenue
)
union all
select 'd2' as which, d2.*
from data1 d2
where not exists (select 1
from data1 d1
where d1.date = d2.date and d1.revenue <=> d2.revenue
);
Your sample query only compares revenue
, but you can use the same logic to compare sales
. Note that <=>
is the NULL
-safe comparison operator.
Upvotes: 0