Reputation: 387
Basically I'm trying to create a "suggestions" page based on user interests.
In a accumulative table I store all the products seen by every user. My idea was to select all seen products of all people that have visited products seen by me. I've been trying to come up with a query for awhile now but my best idea was going
Thing is, I think doing it with few queries is not scalable if my table grows. With the subqueries I had the query destroyed my database even though when i used EXPLAIN everything seemed fine (no temp table, no disk hits) yet when I raw the query it took over a minute to execute ... in a table with ~40k rows. Running the queries one by one did what I wanted in under a second so I'm really baffled. Where am I doing a mistake?
The table has the following columns id (PRIMARY), user_id, product_id and bunch of unneeded fields
The following is the SQL query I came up with (that kills my server)
SELECT product_id
FROM user_behavior
WHERE user_id
IN (
SELECT user_id
FROM user_behavior
WHERE user_id <> 43456
AND product_id
IN (
SELECT product_id
FROM user_behavior
WHERE user_id =43456
GROUP BY product_id
)
AND offer_city_id
IN ( 0, 2 )
)
As I said running explain returns the following
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
|---- |-------------------- |--------------- |---------------- |------------------- |--------- |--------- |------- |------- |-------------------------- |
| 1 | PRIMARY | user_behavior | index | NULL | user_id | 8 | NULL | 25800 | Using where; using index |
| 2 | DEPENDENT SUBQUERY | user_behavior | index_subquery | user_id,user_id_2 | user_id | 4 | func | 3 | Using where |
| 3 | DEPENDENT SUBQUERY | user_behavior | ref | user_id,user_id_2 | user_id | 4 | const | 76 | Using where; using index |
EDIT: I am sorry, I can't visualize a table :(
Upvotes: 0
Views: 477
Reputation: 142356
Don't use IN ( SELECT ... )
.
I am quite lost in what the query is trying to do, but switching to JOIN
and EXISTS
is likely to be part of the solution. Perhaps something close to this:
SELECT s.product_id
FROM user_behavior AS a
JOIN user_behavior AS b ON b.user_id = a.user_id
WHERE EXISTS (
SELECT *
FROM user_behavior
WHERE product_id = b.product_id
AND user_id = 43456
)
AND b.offer_city_id IN ( 0, 2 )
AND b.user_id <> 43456
And include a 'composite' INDEX(user_id, product_id)
(in either order)
Or maybe only this is needed??...
SELECT DISTINCT product_id
FROM user_behavior AS b
WHERE EXISTS (
SELECT *
FROM user_behavior
WHERE product_id = b.product_id
AND user_id = 43456
)
AND offer_city_id IN ( 0, 2 )
AND user_id <> 43456
Well, I hope you get some ideas from these attempts.
Upvotes: 1