Reputation: 12550
I have the following data:
tag_id | taggable_id
--------+-------------
1 | 1
2 | 1
3 | 1
4 | 1
1 | 2
2 | 2
5 | 2
6 | 2
7 | 3
8 | 3
9 | 3
10 | 3
And I want to bring all the taggable_ids which are in a group of tag_ids AND in another group AND another... (max 4 groups).
For example:
In a gross way(for the second example), the query is the following:
SELECT taggable_id FROM taggings WHERE tag_id in (1)
INTERSECT
SELECT taggable_id FROM taggings WHERE tag_id in (6)
INTERSECT
SELECT taggable_id FROM taggings WHERE tag_id in (2,8)
I think simplifying it, it looks like:
SELECT taggable_id FROM taggings WHERE tag_id in (1,2,6,8)
GROUP BY taggable_id HAVING COUNT(*)=3
but I'm wondering if can be done in a simpler way. Any thoughts?
Upvotes: 1
Views: 766
Reputation: 656411
This can be cast as a case of relational division. We have assembled an arsenal of query techniques under this related question:
Depending on data distribution and other factors, this may be fastest:
SELECT DISTINCT taggable_id
FROM taggings t1
JOIN taggings t2 USING (taggable_id)
JOIN taggings t3 USING (taggable_id)
WHERE t1.tag_id = 1
AND t2.tag_id = 6
AND t3.tag_id IN (2, 8);
Assuming unique (tag_id, taggable_id)
, DISTINCT
is actually not needed for the example. But it might be necessary with other (list) predicates.
SQL Fiddle (building on @Clodoaldo's, thanks).
Upvotes: 3
Reputation: 125214
Your second query fails if the tuple (8, 2)
is inserted. Here is one solution although I don't know if simpler then the intersect
one:
select taggable_id
from taggings
where tag_id in (1,2,6,8)
group by taggable_id
having
array_agg(tag_id) @> array[1,2]
and
array_agg(tag_id) && array[6,8]
Upvotes: 2
Reputation: 1269583
Your second query is not equivalent to the first. Change the having
clause to:
HAVING SUM(CASE WHEN tag_id = 1 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN tag_id = 6 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN tag_id IN (2, 8) THEN 1 ELSE 0 END) > 0;
As for performance, test the queries to see which performs better. The advantage of the having
approach is you can add more and more complex conditions without having a significant impact on performance.
Upvotes: 1