Reputation: 35928
I have two large tables. Both containing around 17M rows each. They should have same exact number of rows but I am finding that the counts are different by 343. I want to find out where the counts are different. Tables look like this:
Table A
ID | color
---| ---------
1 | red
1 | green
1 | blue
2 | white
3 | black
3 | red
Tabls B
ID | sale_dates
---| ----------
1 | 2020-10-01
1 | 2020-01-10
2 | 2018-01-09
3 | 2017-08-08
Based on above I would like an output like below:
ID | Table A | Table B | Difference
---| --------| --------| ----------
1 | 5 | 2 | 3
2 | 1 | 1 | 0
3 | 2 | 1 | 1
Or even only find out the ones where the difference is not 0
Upvotes: 0
Views: 172
Reputation: 1269563
This is a tweak on Nick's answer. I think a full join
is very important in this type of situation, because it is possible that some ids are missing from one table or the other:
SELECT ID, a.cnt, b.cnt,
(COALESCE(a.cnt, 0) - COALESCE(b.cnt, 0)) as difference
FROM (SELECT UPPER(ID) as id, COUNT(*) AS cnt
FROM A
GROUP BY UPPER(ID)
) A FULL JOIN
(SELECT UPPER(ID) as id, COUNT(*) AS cnt
FROM B
GROUP BY UPPER(ID)
) B
USING (ID)
ORDER BY difference DESC;
Add:
WHERE COALESCE(a.cnt, 0) <> COALESCE(b.cnt)
if you only want ids where the counts are not the same.
Upvotes: 1
Reputation: 147146
If the two tables will always have the same set of ID
values, you can just JOIN
two derived tables of COUNT(*)
values to get your desired output:
SELECT A.ID,
"Table A",
"Table B",
"Table A" - "Table B" AS Difference
FROM (
SELECT ID, COUNT(*) AS "Table A"
FROM A
GROUP BY ID
) A
JOIN (
SELECT ID, COUNT(*) AS "Table B"
FROM B
GROUP BY ID
) B ON A.ID = B.ID
ORDER BY A.ID
Output:
id Table A Table B difference
1 3 2 1
2 1 1 0
3 2 1 1
If you only want the ID
values which have a non-zero difference, add
WHERE "Table A" - "Table B" > 0
before the ORDER BY
clause.
Upvotes: 3