Reputation: 244
I have two SQL queries: one uses an inner join first to match rows on a condition, and the other does not. Ultimately, I would like the difference between the count columns produced by each query. How can I do this?
I have tried unioning and joining the queries as in some similar posts, but it didn't work. I wonder if the issue is with the joins within each query.
Query 1 :
SELECT A.date, COUNT(DISTINCT A.id)
FROM A
INNER JOIN B
ON A.id = B.id AND A.date = B.date
AND B.col1 = 'value1'
LEFT JOIN C on C.key = A.key
WHERE A.col1 = 'value2'
AND C.category = 'cat1'
GROUP BY 1
ORDER BY 1 DESC
Query 2 :
SELECT A.date, COUNT(DISTINCT A.id)
FROM A
LEFT JOIN C on C.key = A.key
WHERE A.col1 = 'value2'
AND C.category = 'cat1'
GROUP BY 1
ORDER BY 1 DESC
Upvotes: 1
Views: 83
Reputation: 37472
Your left join of c is effectively turned into an inner join because c is used in a NULL-excluding expression in the WHERE clause. So you can inner join c directly and left join b instead. Then you can use a CASE expression in one count() to count only the instances where a row from b was joined. Subtract that value from another count() that counts all occurrences to get the difference.
SELECT a.date,
count(DISTINCT a.id)
-
count(DISTINCT CASE
WHEN b.id IS NOT NULL THEN
a.id
END)
FROM a
INNER JOIN c
ON c.key = a.key
AND c.category = 'cat1'
LEFT JOIN b
ON a.id = b.id
AND a.date = b.date
AND b.col1 = 'value1'
WHERE a.col1 = 'value2'
GROUP BY 1
ORDER BY 1 DESC;
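As a rough check of the query above, here is a minimal sketch that runs it against an in-memory SQLite database and compares the result with the difference of the two original queries. The sample rows are invented for illustration (only the table and column names come from the question), and SQLite is just a convenient stand-in for whatever engine you actually use.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, date TEXT, key TEXT, col1 TEXT);
CREATE TABLE b (id INTEGER, date TEXT, col1 TEXT);
CREATE TABLE c (key TEXT, category TEXT);
INSERT INTO a VALUES
  (1, '2024-01-01', 'k1', 'value2'),
  (2, '2024-01-01', 'k1', 'value2'),
  (3, '2024-01-01', 'k2', 'value2'),
  (6, '2024-01-01', 'k1', 'value2'),
  (7, '2024-01-01', 'k1', 'value2'),
  (4, '2024-01-02', 'k1', 'value2'),
  (5, '2024-01-02', 'k1', 'other');
INSERT INTO b VALUES
  (1, '2024-01-01', 'value1'),
  (2, '2024-01-01', 'other'),
  (6, '2024-01-01', 'value1');
INSERT INTO c VALUES ('k1', 'cat1'), ('k2', 'cat2');
""")

# Query 1 from the question (with the inner join on b).
QUERY_1 = """
SELECT A.date, COUNT(DISTINCT A.id)
FROM A
INNER JOIN B ON A.id = B.id AND A.date = B.date AND B.col1 = 'value1'
LEFT JOIN C ON C.key = A.key
WHERE A.col1 = 'value2' AND C.category = 'cat1'
GROUP BY 1 ORDER BY 1 DESC
"""

# Query 2 from the question (without b).
QUERY_2 = """
SELECT A.date, COUNT(DISTINCT A.id)
FROM A
LEFT JOIN C ON C.key = A.key
WHERE A.col1 = 'value2' AND C.category = 'cat1'
GROUP BY 1 ORDER BY 1 DESC
"""

# Single-pass version using conditional aggregation.
COMBINED = """
SELECT a.date,
       count(DISTINCT a.id)
       - count(DISTINCT CASE WHEN b.id IS NOT NULL THEN a.id END)
FROM a
INNER JOIN c ON c.key = a.key AND c.category = 'cat1'
LEFT JOIN b ON a.id = b.id AND a.date = b.date AND b.col1 = 'value1'
WHERE a.col1 = 'value2'
GROUP BY 1 ORDER BY 1 DESC
"""

with_b = dict(conn.execute(QUERY_1).fetchall())
all_rows = conn.execute(QUERY_2).fetchall()
# Dates missing from query 1 count as 0 matches.
expected = [(d, n - with_b.get(d, 0)) for d, n in all_rows]
combined = conn.execute(COMBINED).fetchall()
print(combined)
```

With this sample data, 2024-01-02 has no matching row in b at all, so query 1 returns no row for that date while the combined query still reports it.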
Upvotes: 1
Reputation: 222402
A simple way is to JOIN the two queries on the date column, which is available in both queries:
SELECT x.date, x.cnt, y.cnt, y.cnt - x.cnt
FROM
(
SELECT A.date, COUNT(DISTINCT A.id) AS cnt
FROM A
INNER JOIN B ON A.id = B.id AND A.date = B.date AND B.col1 = 'value1'
LEFT JOIN C on C.key = A.key
WHERE A.col1 = 'value2' AND C.category = 'cat1'
GROUP BY 1
) AS x
INNER JOIN (
SELECT A.date, COUNT(DISTINCT A.id) AS cnt
FROM A
LEFT JOIN C on C.key = A.key
WHERE A.col1 = 'value2' AND C.category = 'cat1'
GROUP BY 1
) AS y ON x.date = y.date
ORDER BY 1 DESC
You might want to adapt the join type according to your data layout:
LEFT JOIN if all dates are available in the first subquery but may be missing in the second subquery
RIGHT JOIN if the situation is the other way around
FULL OUTER JOIN if you want all available dates from both ends
If you choose any of the above options, you would need to use COALESCE to prevent the subtraction from returning NULL when one of the terms is NULL.
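To illustrate the COALESCE point, here is a small sketch using an in-memory SQLite database. The sample rows are made up, and the subqueries are swapped so that the one without b comes first and is LEFT JOINed to the other; that is equivalent to the RIGHT JOIN case above and is written this way only because older SQLite versions lack RIGHT and FULL OUTER JOIN.

```python
import sqlite3

# Hypothetical sample data; only the table/column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INTEGER, date TEXT, key TEXT, col1 TEXT);
CREATE TABLE b (id INTEGER, date TEXT, col1 TEXT);
CREATE TABLE c (key TEXT, category TEXT);
INSERT INTO a VALUES
  (1, '2024-01-01', 'k1', 'value2'),
  (2, '2024-01-01', 'k1', 'value2'),
  (6, '2024-01-01', 'k1', 'value2'),
  (7, '2024-01-01', 'k1', 'value2'),
  (4, '2024-01-02', 'k1', 'value2');
INSERT INTO b VALUES
  (1, '2024-01-01', 'value1'),
  (6, '2024-01-01', 'value1');
INSERT INTO c VALUES ('k1', 'cat1');
""")

SQL = """
SELECT y.date,
       y.cnt - x.cnt              AS raw_diff,  -- NULL when x has no row
       y.cnt - COALESCE(x.cnt, 0) AS diff       -- missing count treated as 0
FROM (
    SELECT A.date, COUNT(DISTINCT A.id) AS cnt
    FROM A
    LEFT JOIN C ON C.key = A.key
    WHERE A.col1 = 'value2' AND C.category = 'cat1'
    GROUP BY 1
) AS y
LEFT JOIN (
    SELECT A.date, COUNT(DISTINCT A.id) AS cnt
    FROM A
    INNER JOIN B ON A.id = B.id AND A.date = B.date AND B.col1 = 'value1'
    LEFT JOIN C ON C.key = A.key
    WHERE A.col1 = 'value2' AND C.category = 'cat1'
    GROUP BY 1
) AS x ON x.date = y.date
ORDER BY 1 DESC
"""

rows = conn.execute(SQL).fetchall()
for row in rows:
    print(row)
```

Here 2024-01-02 has no match in b, so the plain subtraction comes back as NULL (None in Python), while the COALESCE version yields the full count for that date.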
Upvotes: 1
Reputation: 4165
SELECT A.date, COUNT(DISTINCT A.id)
FROM A
INNER JOIN B
ON A.id = B.id AND A.date = B.date
AND B.col1 = 'value1'
LEFT JOIN C on C.key = A.key
WHERE A.col1 = 'value2'
AND C.category = 'cat1'
GROUP BY 1
UNION
SELECT A.date, COUNT(DISTINCT A.id)
FROM A
LEFT JOIN C on C.key = A.key
WHERE A.col1 = 'value2'
AND C.category = 'cat1'
GROUP BY 1
ORDER BY 1 DESC
Upvotes: 1