Reputation: 7391
I have a table with a composite primary key (ID
, Date
) like below.
+------+------------+-------+ | ID | Date | Value | +------+------------+-------+ | 1 | 1433419200 | 15 | | 1 | 1433332800 | 23 | | 1 | 1433246400 | 41 | | 1 | 1433160000 | 55 | | 1 | 1432900800 | 24 | | 2 | 1433419200 | 52 | | 2 | 1433332800 | 23 | | 2 | 1433246400 | 39 | | 2 | 1433160000 | 22 | | 3 | 1433419200 | 11 | | 3 | 1433246400 | 58 | | ... | ... | ... | +------+------------+-------+
There is also a separate index on Date
column. The table is of moderate size, currently ~600k row and growing by ~2k everyday.
I want to do a single SELECT query that returns the latest 3 records (ordered by Date
timestamp) for each ID
. For each given ID
, the Date
values are always unique, so no need to worry about ties for Date
here.
I've tried a self-join approach, inspired by this answer, but it took quite a few seconds to run and returned nothing:
SELECT p1.ID, p1.Date, p1.Value FROM MyTable AS p1
LEFT JOIN MyTable AS p2
ON p1.ID=p2.ID AND p1.Date<=p2.Date
GROUP BY p1.ID
HAVING COUNT(*)<=5
ORDER BY p1.ID, p1.Date DESC;
What would be a fast solution here?
Upvotes: 6
Views: 19565
Reputation: 1
SELECT distinct x.ID,x.Date,X.Value
FROM ( SELECT DISTINCT ID FROM XXXTable ) c
CROSS APPLY (
select top 3 A.ID,a.Date,Value,[Count] from (
SELECT distinct ID,Date,Value, ROW_NUMBER()
over (
PARTITION BY ID
order by Date
) AS [Count] where c.ID = t.ID
) A order by [Count] desc
Upvotes: 0
Reputation: 180070
You could look up the three most recent dates for each ID:
SELECT ID, Date, Value
FROM MyTable
WHERE Date IN (SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 3)
Alternatively, look up the third most recent date for each ID, and use it as a limit:
SELECT ID, Date, Value
FROM MyTable
WHERE Date >= IFNULL((SELECT Date
FROM MyTable AS T2
WHERE T2.ID = MyTable.ID
ORDER BY Date DESC
LIMIT 1 OFFSET 2),
0)
Both queries should get good performance from the primary key's index.
Upvotes: 14
Reputation: 1269873
First, here is the correct query for the inequality method:
SELECT p1.ID, p1.Date, p1.Value
FROM MyTable p1 LEFT JOIN
MyTable AS p2
ON p1.ID = p2.ID AND p2.Date <= p1.Date
--------------------------^ fixed this condition
GROUP BY p1.ID, p1.Date, p1.Value
HAVING COUNT(*) <= 5
ORDER BY p1.ID, p1.Date DESC;
I'm not sure if there is a fast way to do this in SQLite. In most other databases, you can use the ANSI standard row_number()
function. In MySQL, you can use variables. Both of these are difficult in SQLite. Your best solution may be to use a cursor.
The above can benefit from an index on MyTable(Id, Date)
.
Upvotes: 3