Gublooo
Gublooo

Reputation: 2618

Sql Query - Limiting query results

I am quite certain we cannot use the LIMIT clause for what I want to do - so wanted to find if there are any other ways we can accomplish this.

I have a table which captures which user visited which store. Every time a user visits a store, a row is inserted into this table.

Some of the fields are

Now what I want is - for a given set of stores, find the top 5 users who have visited the store max number of times.

I can do this 1 store at a time as:

select store_id,user_id,count(1) as visits 
from shopping 
where store_id = 60
group by user_id,store_id 
order by visits desc Limit 5

This will give me the 5 users who have visited store_id=60 the max times

What I want to do is provide a list of 10 store_ids and for each store fetch the 5 users who have visited that store max times

select store_id,user_id,count(1) as visits 
from shopping 
where store_id in  (60,61,62,63,64,65,66)
group by user_id,store_id 
order by visits desc Limit 5
This will not work as the Limit at the end will return only 5 rows rather than 5 rows for each store.

Any ideas on how I can achieve this. I can always write a loop and pass 1 store at a time but wanted to know if there is a better way

Upvotes: 7

Views: 985

Answers (5)

Patrick
Patrick

Reputation: 15717

Using two user variable and counting the same consecutive store_id, you can replace <= 5 with whatever limit you want

SELECT a.*
FROM (
 SELECT store_id, user_id, count(1) as visits 
 FROM shopping
 WHERE store_id IN (60,61,62,63,64,65,66)
 GROUP BY store_id, user_id
 ORDER BY store_id, visits desc, user_id
) a,
(SELECT @prev:=-1, @count:=1) b
WHERE
 CASE WHEN @prev<>a.store_id THEN
   CASE WHEN @prev:=a.store_id THEN
    @count:=1
   END
 ELSE
   @count:=@count+1
 END <= 5

Edit as requested some explanation :

The first subquery (a) is the one that group and order the data so you will have data like:

store_id | user_id | visits
---------+---------+-------
 60           1       5
 60           2       3
 60           3       1
 61           2       4
 61           3       2

the second subquery (b) init the user variable @prev with -1 and @count with 1

then we choose all data from the subquery (a) verifying the condition in the case.

  • verify that the previous store_id (@prev) we have seen is different from the current store_id. Since the first @prev is equal to -1 there is nothing that match the current store_id so the condition <> is true we enter then is the second case who just serve to change the value @prev with the current store_id. This is the trick so i can change the two user variable @count and @prev in the same condition.

  • if the previous store_id is equal to @prev just increment the @count variable.

  • we check that the count is within the value we want so the <= 5

So with our test data the:

step | @prev | @count | store_id | user_id | visits
-----+-------+--------+----------+---------+-------
  0      -1      1    
  1      60      1        60          1        5 
  2      60      2        60          2        3
  3      60      3        60          3        1
  4      61      1        61          2        4
  5      61      2        61          3        2   

Upvotes: 3

Tobias
Tobias

Reputation: 737

If you will not save data about when a user visited a store or something like this, you could simply update the table each time a user visits a store instead of appending a new row.

Something like this:

INSERT INTO `user_store` (`user_id`, `store_id`, `visits`) VALUES ('USER', 'SHOP', 1)
ON DUPLICATE KEY UPDATE `visits` = `visits` + 1

But I think this would not work, because neither user_id nor store_id are unique. You have to add a unique primary key like this: user#store or something else.

Another opinion would be to save this data (how often a user was in a store) in a separate table containing of ID, user_id, store_id, visits and increment visits everytime you also add a new row to you existing table.

To get the Top5 you can then use:

SELECT `visits`, `user_id` FROM `user_store_times` WHERE `store_id`=10 ORDER BY `visits` DESC LIMIT 5

Upvotes: 1

Rahul Prasad
Rahul Prasad

Reputation: 8222

Major concern over here is number of times you query a database. If you query multiple times from your script. Its simply wastage of resources and must be avoided. That is you must NOT run a loop to run the SQL multiple times by incrementing certain value. In your case 60 to 61 and so on.

Solution 1: Create a view Here is the solution

CREATE VIEW myView AS
select store_id,user_id,count(1) as visits 
from shopping 
where store_id = 60
group by user_id,store_id 
order by visits desc Limit 5
UNION
select store_id,user_id,count(1) as visits 
from shopping 
where store_id = 61
group by user_id,store_id 
order by visits desc Limit 5
UNION
select store_id,user_id,count(1) as visits 
from shopping 
where store_id = 62
group by user_id,store_id
order by visits desc Limit 5 

Now use

SELECT * from MyView

This is limited because you cant make it dynamic. What if you need 60 to 100 instead of 60 to 66.

Solution 2: Use Procedure. I wont go into how to write a procedure cuz its late night and I got to sleep. :) Well, procedure must accept two values 1st inital number (60) and 2nd Count (6) Inside the procedure create a temporary table (cursor) to store data then run a loop from initial number till count times In your case from 60 to 66 Inside the loop write desired script Replacing 60 with a looping variable.

select store_id,user_id,count(1) as visits 
from shopping 
where store_id = 60
group by user_id,store_id 
order by visits desc Limit 5

And append the result in the temporary table (cursor).

Hope this will solve your problem. Sorry I couldn't give you the code. If you still need it plz send me a message. I will give it to you when I wake up next morning.

Upvotes: 2

DisgruntledGoat
DisgruntledGoat

Reputation: 72530

The simplest way would be to issue 10 separate queries, one for each store. If you use parameterized queries (e.g. using PDO in PHP) this will be pretty fast since the query will be part-compiled.

If this still proves to be too resource-intensive, then another solution would be to cache the results in the store table - i.e. add a field that lists the top 5 users for each store as a simple comma-separated list. It does mean your database would not be 100% normalised but that shouldn't be a problem.

Upvotes: 0

Yada
Yada

Reputation: 31225

UNION may be what you are looking for.

-- fist store
(select store_id,user_id,count(1) as visits 
from shopping 
where store_id = 60
group by user_id,store_id 
order by visits desc Limit 5)
UNION ALL
-- second store
(select store_id,user_id,count(1) as visits 
from shopping 
where store_id = 61
group by user_id,store_id 
order by visits desc Limit 5)
...

http://dev.mysql.com/doc/refman/5.0/en/union.html

Upvotes: 1

Related Questions