Reputation: 874
I'm sure there is an answer present for this question but bear with me as I'm new to SQL and am not sure how to ask the question.
I have data like this (this is shorthand purely for example). This is in a postgres db.
table1
id value
1 111
1 112
1 113
2 111
2 112
2 116
3 111
3 122
3 123
4 126
5 123
5 125
6 111
6 112
6 116
table2
value
111
112
116
I need return the id of table1 where all values in table2 exist in the values of table1. So for this example, my query would return 2 and 6.
Is there any way to do this in SQL? Or could you possibly guide me to a data structure that would allow for me to get this result? I am able to change up the structure of either table to accommodate the ultimate need of getting this result
Thank you so much. An answer to this would be a life saver.
Upvotes: 4
Views: 8865
Reputation: 44250
NOT EXISTS(... NOT EXISTS) is a standard solution to relational division:
SELECT DISTINCT id
FROM table1 t1
WHERE NOT EXISTS (
SELECT * FROM table2 t2
WHERE NOT EXISTS (
SELECT * FROM table1 t1x
WHERE t1x.value = t2.value
AND t1x.id = t1.id
)
)
;
In this case, the DISTINCT
is needed because we don't have access to the domain table with id
s, only to the junction table t1
referring to it.
Upvotes: 0
Reputation: 57023
It seems to me that as much as anything you want to know how to ask the right question. The magic words here are "relational division".
It is one of the operators in Codd's relational algebra and there have been several variations proposed since. Most recently, Chris Date has proposed replacing the whole concept with image relations.
SQL has no explicit divide operator. There are a number of workarounds using other operator and the most appropriate will depend on your requirements, including exact division or division with remainder and how to handle an empty divisor. Then there are the usual considerations: SQL product and version, performance, personal style and taste, etc.
Here are a couple of articles which should help you with these choices:
On Making Relational Division Comprehensible
Divided We Stand: The SQL of Relational Division
Upvotes: 3
Reputation:
UPDATE Another possibility:
SELECT t1.id
FROM (SELECT t1.id, t1.value
FROM table1 t1
JOIN table2 t2 USING (value)
GROUP BY t1.id, t1.value
ORDER BY t1.id) t1
GROUP BY t1.id
HAVING COUNT(*) = (SELECT COUNT(*) FROM table2)
The cost of my answer, if you use EXPLAIN ANALYZE is always 893-900, even with repeated rows.
Upvotes: 1
Reputation: 656844
Consider this demo:
CREATE TEMP TABLE table1(id int, value int);
INSERT INTO table1 VALUES
(1,111),(1,112),(1,113)
,(2,111),(2,112),(2,116)
,(3,111),(3,122),(3,123)
,(4,126)
,(5,123),(5,125)
,(6,111),(6,112),(6,116);
CREATE TEMP TABLE table2(value int);
INSERT INTO table2 VALUES
(111)
,(112)
,(116);
SELECT t1.id
FROM table1 t1
JOIN table2 t2 USING (value)
GROUP BY t1.id
HAVING count(*) = (SELECT count(*) FROM table2)
ORDER BY t1.id;
Result:
id
-----
2
6
Returns all ids of table1
that appear with all values provided by table2
once.
Works for any number of rows in both tables.
If duplicate rows can appear in table1
make that:
HAVING count(DISTINCT value) = (SELECT count(*) FROM table2)
Upvotes: 6