madkitty
madkitty

Reputation: 1675

Select count distinct and common rows from two tables

I have two tables, T1 and T2 as follow :

CATEGORY     ID
1           1100
1           1200
1           1300
1           1500 
2           2000
2           2100
2           2300
2           2500

I need to know :

I'm bunching my head on it since this morning, and tried to do that to get the similar rows:

select count(*) from T1, T2 WHERE 
T1.CATEGORY = T2.CATEGORY AND T1.ID = T2.ID; 

But I can't figure out how to get unique rows (only in T1 or T2).

Upvotes: 0

Views: 4331

Answers (3)

wildplasser
wildplasser

Reputation: 44240

DROP SCHEMA tmp CASCADE;
CREATE SCHEMA tmp ;
SET search_path=tmp;

CREATE TABLE lutser
        ( id INTEGER NOT NULL
        , category INTEGER NOT NULL
        );
INSERT INTO lutser(category, id) VALUES
(1,1100) ,(1,1200) ,(1,1300) ,(1,1500)
,(2,2000) ,(2,2100) ,(2,2300) ,(2,2500)
,(1,3500) -- added these
,(2,3500)
        ;

These queries construct a "bit mask" 1 for category==1, 2 for category==2, and add them up. So the mask is 3 when the id is present in both sets, 1 when only in the first set, and 2 when only in the second set. The outer join + coalesce do the trick here.

        --
        -- CTE version
        --
WITH flags AS (
        WITH one AS ( SELECT category AS flag , id FROM lutser WHERE category = 1)
        , two AS ( SELECT category AS flag , id FROM lutser WHERE category = 2)
        SELECT COALESCE(one.flag, 0) + COALESCE(two.flag, 0) AS flag
        FROM one
        FULL OUTER JOIN two ON two.id = one.id
        )
SELECT flag, COUNT(*)
FROM flags
GROUP BY flag;

        --
        -- Non-CTE version
        --
SELECT COALESCE(one.flag, 0) + COALESCE(two.flag, 0) AS flags
        , COUNT(*)
FROM (
        SELECT category AS flag , id
        FROM lutser WHERE category = 1
        ) one
FULL OUTER JOIN (
        SELECT category AS flag , id
        FROM lutser WHERE category = 2
        ) two ON two.id = one.id
GROUP BY flags;

Result (for both queries ;-):

 flags | count 
-------+-------
     1 |     4
     2 |     4
     3 |     1

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269623

If you cannot assume that the rows are distinct, then you need to take a slightly different approach. Here is a method that answers all three questions at the same time, taking into account duplicate rows:

select (case when isT1 = 1 and isT2 = 0 then 'BOTH'
             when isT1 = 1 then 'T1-Only'
             else 'T2-Only'
        end) as WhereRow,
       count(*) as NumDistinctRows,
       sum(cnt) as NumTotalRows
from ((select category, id, count(*) as cnt, 1 as isT1, 0 as isT2
       from t1
       group by category, id
      ) union all
      (select category, id, count(*) as cnt, 0 as isT1, 1 as isT2
       from t2
       group by category, id
      )
     ) t
group by isT1, isT2

Upvotes: 0

John Woo
John Woo

Reputation: 263703

Question 1

SELECT  COUNT(*) totalCount
FROM    T1 a 
        INNER JOIN T2 b
            ON a.Category = b.Category AND
                a.ID = b.ID

Question 2 (use LEFT JOIN)

SELECT  COUNT(*) totalCount
FROM    T2 a 
        LEFT JOIN T1 b
            ON a.Category = b.Category AND
                a.ID = b.ID
WHERE   b.Category IS NULL

Question 3 (use LEFT JOIN)

SELECT  COUNT(*) totalCount
FROM    T1 a 
        LEFT JOIN T2 b
            ON a.Category = b.Category AND
                a.ID = b.ID
WHERE   b.Category IS NULL

Upvotes: 5

Related Questions