Reputation: 1019
I'd like to know the most efficient SQL query for achieving this problem:
Say we have a table with two columns, one storing entry ids (entry_id
) and one storing category ids (cat_id
):
entry_id cat_id
3 1
3 2
3 3
3 20
4 1
4 2
4 21
I'd like to count how many distinct entry_id
's there are in the categories 1, 2 OR 3 but that also must be in cat_id
20.
For example, categories 1, 2 and 3 might represent music genres (Country, Pop etc.), while category 20 might be recording formats (CD, Vinyl etc.). So another way of putting it verbally could be: "How many products are there that are on Vinyl and in either the Pop or Country category?"
I could achieve this with a nested loop in code (PHP) or possibly with a nested SQL subquery, but neither feels that efficient. I feel there must be an obvious answer to this staring me in the face...
EDIT TO ADD:
I would also like to do this without modifying the database design, as it's a third party system.
FURTHER EXAMPLE TO CLARIFY:
Another real-world example of why I'd need this data:
Let's say the category ids instead represent either:
OR
Let's say someone has selected that they are interested in camping (cat_id
= 1). Now we need to count how many camping products there are in the Europe. A product might be tagged as both Europe (parent), UK (child) AND England (grand-child), giving us an array of category ids 1, 2 or 3. So we now need to count how many distinct products there are in both those categories AND the original accommodation category of 1 (camping).
So having selected Camping, the end result might look something like:
Hope that helps...
Upvotes: 1
Views: 322
Reputation: 11963
With no subqueries, using JOIN
and GROUP BY
:
Join the table to itself using entry_id (this gives you all possible pairs of cat_id for that entry_id). Select rows having cat_id both a member of (1,2,3) and the second cat_id = 20.
SELECT r1.entry_id
FROM records r1
JOIN records r2 USING(entry_id)
WHERE r1.cat_id IN (1,2,3)
AND r2.cat_id = 20 GROUP BY entry_id;
Upvotes: 1
Reputation: 23123
I believe you want GROUP BY, COUNT() and EXISTS()
declare @t table(entry_id int, cat_id int)
insert @t select 1, 1
insert @t select 2, 1
insert @t select 1, 2
insert @t select 2, 2
insert @t select 3, 1
insert @t select 1, 20
select t1.cat_id, COUNT(*)
from @t as t1
where exists(
select * from @t
where t1.entry_id = entry_id
and cat_id = 20)
group by t1.cat_id
V2 using join instead of EXISTS()
declare @t table(entry_id int, cat_id int)
insert @t select 1, 1
insert @t select 2, 1
insert @t select 1, 2
insert @t select 2, 2
insert @t select 3, 1
insert @t select 1, 20
select t1.cat_id, COUNT(*)
from @t as t1
join @t as t2 on t1.entry_id = t2.entry_id and t2.cat_id = 20
group by t1.cat_id
Upvotes: 1
Reputation: 6003
select count(distinct entry_id) from myTable where cat_id=20 and entry_id in
(select distinct entry_id from myTable where cat_id in (1,2,3));
Upvotes: 1