Reputation: 53
I am looking for the best possible way to re-write a query that uses the IN() clause. I'd like to sort the results by products that have the most total matches in products_adjectives and products_interests.
I have 5 tables.
- products
title, price, description, etc.
- interests
interest_id, interest_name
- adjectives
adjective_id, adjective_name
- products_interests
product_id, interest_id
- products_adjectives
product_id, adjective_id
I'm currently using this query to return any product that matches any adjective or interest. Here's an example:
SELECT *
FROM products
LEFT JOIN products_adjectives
ON products_adjectives.product_id = products.id
LEFT JOIN products_interests
ON products_interests.product_id = products.id
WHERE products_adjectives.adjective_id IN (3,6,9,12,13) OR products_interests.interest_id IN (2,5,8,12,16,18)
GROUP BY products.id ORDER BY ABS(products.price)
Currently, any product that exists in any of the adjective_ids or interest_ids is returned. That query will return any products that have any adjective_id 3,6,9,12,13
or any interest_id 2,5,8,12,16,18
I'd like to continue returning all results, but instead I'd like them sorted by which products contain the most matches of adjectives or interests. So, if a product has adjectives 3,6,9,10,11
I'd like that ahead of a product that has adjective 3,4,5,7,8
Any thoughts or direction would be appreciated. If I'm going about this entirely wrong (with the IN() clause) please let me know!
Thanks!
Upvotes: 2
Views: 93
Reputation: 1269703
You have a problem with your query because you are generating a cartesian product between the interests and adjectives for a given product.
The simplest way to fix this is to count the distinct matches. So, the following order by
clause should do what you want:
ORDER BY (count(distinct interest_id) + count(distinct adjective_id)) desc,
abs(product.price)
You could fix this problem in other ways, such as counting the interests in a subquery and counting the adjectives in a subquery. That would be the recommended approach, particularly if the lists were to become large. (Say, 30 matching interests and 30 matching adjectives would generate an intermediate table with almost one thousand rows for the product.)
Upvotes: 0
Reputation: 6275
I'm thinking something like the below. The key idea you were missing is ordering by count(*) DESC
The downside of this query is that no products will be pulled if there are no matching records in the interests or adjectives tables. However, since I think you are pulling things based on their relevance, you should not want products with zero relevance.
SELECT *, count(*) as relevance
FROM products, products_adjectives, products_interests
WHERE products.id = products_interests.product_id
AND products.id = products_adjectives.product_id
AND (interest_id IN (14, 22, 78) OR adjective_id IN (8, 17, 26))
GROUP BY products.id
ORDER BY count(*) DESC, price ASC
See fiddle: http://sqlfiddle.com/#!3/838571/2
Upvotes: 1