Reputation: 36484
I have a problem similar to LIMITing a SQL JOIN, but with a slightly more complex requirement.
I want to search for Users and associated Transactions, which lie within a time range:
SELECT u.*, t.*
FROM User u
JOIN Transaction t ON t.user_id = u.id
WHERE t.timestamp >= ? and t.timestamp <= ?;
So far, so good. Now I want to repeat the query, but with a LIMIT on the number of users returned. There should be no limit on the number of transactions returned for a given user, though.
If I follow the approach suggested in the other question, this would translate into:
SELECT u.*, t.*
FROM (SELECT * FROM User LIMIT 10) u
JOIN Transaction t ON t.user_id = u.id
WHERE t.timestamp >= ? and t.timestamp <= ?;
This will not produce what I want: it will return the first 10 users, who might not have any transactions associated.
I want to return 10 users who have at least one associated transaction in the given time range.
How can I achieve this using MySQL?
Upvotes: 4
Views: 144
Reputation: 1269633
You can do this without variables, but it requires repeating the join
logic:
SELECT u.*, t.*
FROM (SELECT *
FROM User
WHERE EXISTS (SELECT 1
FROM Transaction t
WHERE t.user_id = u.id AND
t.timestamp >= ? and t.timestamp <= ?
)
LIMIT 10
) u JOIN
Transaction t
ON t.user_id = u.id
WHERE t.timestamp >= ? and t.timestamp <= ?;
EDIT:
Probably the fastest answer is something like this:
select u.*, t.*
from (select user_id
from (select user_id
from transaction t
where t.timestamp >= ? and t.timestamp <= ?
limit 1000
) t
limit 30
) tt join
user u
on tt.userid = u.id join
transaction t
on tt.userid = t.userid and t.timestamp >= ? and t.timestamp <= ?;
The first subquery chooses 1,000 matching records in the transaction table. My guess is that this is more than enough to get 30 users. This list is then joined to the user and transaction table to get the final results. By limiting the list without having to do a full table scan, the first query should be pretty fast . . . especially with an additional index on (timestamp, user)
.
Upvotes: 1
Reputation: 72165
You can use variables for this:
SELECT *
FROM (
SELECT *,
@rn := IF(@uid = user_id, @rn,
IF(@uid := user_id, @rn +1, @rn + 1)) AS rn
FROM (
SELECT u.*, t.*
FROM User u
JOIN Transaction t ON t.user_id = u.id
WHERE t.timestamp >= x and t.timestamp <= y) AS t
CROSS JOIN (SELECT @rn := 0, @uid := 0) AS vars
ORDER BY user_id) AS x
WHERE x.rn <= 10
Variable @rn
is incremented by 1 every time a new user is returned by the query. So we can control the number of users returned using @rn <= 10
.
Upvotes: 1