Reputation: 13
I am attempting to find the top n records when grouped by multiple attributes. I believe it is related to this problem, but I am having difficulty adapting the solution described to my situation.
To simplify, I have a table with columns (did is short for device_id):
id int
did int
dateVal dateTime
I am trying to find the top n device_id's for each day with the most rows.
For example (ignoring id and the time part of dateTime),
did dateVal
1 2017-01-01
1 2017-01-01
1 2017-01-01
2 2017-01-01
3 2017-01-01
3 2017-01-01
1 2017-01-02
1 2017-01-02
2 2017-01-02
2 2017-01-02
2 2017-01-02
3 2017-01-02
Finding the top 2 would yield...
1, 2017-01-01
3, 2017-01-01
2, 2017-01-02
1, 2017-01-02
My current naive approach is only giving me the top 2 across all dates.
--Using SQLite
select date(dateVal) || did
from data
group by date(dateVal), did
order by count(*) desc
limit 2
I'm using the concatenation operator so that I can later extract the rows.
I am using SQLite, but any general SQL explanation would be appreciated.
Upvotes: 1
Views: 2651
Reputation: 297
I tested the query using sql server
select top 2 did, dateVal
from (select *, count(*) as c
from test
group by did,dateVal) as t
order by t.c desc
Upvotes: -1
Reputation: 2466
Similarly to this question, define a CTE that computes all device counts for your desired groups, then use it in a WHERE ... IN
subquery, limited to the top 2 devices for that date:
WITH device_counts AS (
SELECT did, date(dateval) AS dateval, COUNT(*) AS device_count
FROM data
GROUP BY did, date(dateval)
)
SELECT did, date(dateval) FROM device_counts DC_outer
WHERE did IN (
SELECT did
FROM device_counts DC_inner
WHERE DC_inner.dateval = DC_outer.dateval
GROUP BY did, date(dateval)
ORDER BY DC_inner.device_count DESC LIMIT 2
)
ORDER BY date(dateval), did
Upvotes: 3