Reputation: 36846
Let's say we have a books website with more than 100.000 books and 1.000.000 rows of tags.
Users will frequently search for books that have some tags they like and at the same time don't have tags that they don't like.
What will be a best way to do search for frequent users requests?
Let's say user want books with tags 15 and 25 (book should have 2 tags, not any of) and user don't wan't books with tags 50 and 99 and ordered by rating. For common sense we will LIMIT results to 5 and use OFFSET for more.
books:
id | rating
1 | 5
2 | 5
3 | 1
tags_books:
book_id | tag_id
1 | 15
1 | 25
1 | 50
2 | 15
2 | 25
P.S. One of solutions was to do request with having sum, but for big table with frequent requests it will be slow as i understand it:
select b.id from books b
left join tags_books tb on tb.book_id = b.id
group by b.id
having sum(case when tb.tag_id in (1,2,3) then 1 else 0 end) >= 2
and sum(case when tb.tag_id in (11,12,13) then 1 else 0 end) = 0
ORDER BY b.rating LIMIT 5 OFFSET 0
Upvotes: 0
Views: 41
Reputation: 1269793
For this purpose, I would recommend exists
and not exists`:
selet b.*
from books b
where exists (select 1 from tags_books tb where tb.book_id = b.id and tb.tag_id = 15
) and
exists (select 1 from tags_books tb where tb.book_id = b.id and tb.tag_id = 25
) and
not exists (select 1 from tags_books tb where tb.book_id = b.id and tb.tag_id in (50, 99)
) ;
For performance, you want an index on tags_books(book_id, tag_id)
.
If you phrase this as aggregation, I would recommend:
select bt.book_id
from book_tags bt
where bt.tag_id in (15, 25, 50, 99)
group by bt.book_id
having count(*) filter (where bt.tag_id in (15, 25)) = 2 and
count(*) filter (where bt.tag_id in (50, 99)) = 0;
Upvotes: 1