Set intersection/difference in SQL?

Question

I have two tables t1 and t2, each defining a multimap from id to word:

> select * from t1;
id word
1  foo
1  bar
2  baz
2  quux

and

> select * from t2;
id word
1  foo
1  baz
3  baz

What I want is to find out the sizes of the union and intersection of the word sets for each id:

id  t1_union_t2 t1 t2 t2_minus_t1  t1_minus_t2 t1_intersect_t2
1   3           2  2  1            1           1
2   2           2  0  0            2           0
3   1           0  1  1            0           0

Obviously, the columns are not independent, e.g.,

t1_union_t2 = t1 + t2_minus_t1 = t1 + t2 - t1_intersect_t2

I want all of them for consistency checking only.

peterm · Accepted Answer

One way to do it with FULL JOIN

SELECT COALESCE(t1.id, t2.id) id,
       COUNT(*) t1_union_t2,
       COUNT(t1.id) t1,
       COUNT(t2.id) t2,
       SUM(CASE WHEN t1.id IS NULL THEN 1 ELSE 0 END) t2_minus_t1,
       SUM(CASE WHEN t2.id IS NULL THEN 1 ELSE 0 END) t1_minus_t2,
       SUM(CASE WHEN t1.id = t2.id THEN 1 ELSE 0 END) t1_intersect_t2
  FROM t1 FULL JOIN t2
    ON t1.id = t2.id
   AND t1.word = t2.word
 GROUP BY COALESCE(t1.id, t2.id);

Output:

| ID | T1_UNION_T2 | T1 | T2 | T2_MINUS_T1 | T1_MINUS_T2 | T1_INTERSECT_T2 |
|----|-------------|----|----|-------------|-------------|-----------------|
|  1 |           3 |  2 |  2 |           1 |           1 |               1 |
|  2 |           2 |  2 |  0 |           0 |           2 |               0 |
|  3 |           1 |  0 |  1 |           1 |           0 |               0 |

Here is a SQLFiddle demo

Set intersection/difference in SQL?

Answers (2)

Related Questions