Reputation: 3582
I copied the logic of this helpful post Grouped LIMIT in PostgreSQL: show the first N rows for each group? to get the first N rows by group, ordered by creation date. However, I'm really only interested in this for high-volume users with some minimum row count. Does this have to be done as a set of two distinct queries or can I modify my query below?
SELECT userid, createdat, displaydate
FROM
( SELECT ROW_NUMBER() OVER
(PARTITION BY userid ORDER BY createdat) as r,
t.* FROM data t) x
WHERE x.r <=100
In particular I tried the following, which led to an error:
SELECT userid, createdat, displaydate
FROM
( SELECT ROW_NUMBER() OVER
(PARTITION BY userid ORDER BY createdat) as r,
t.* FROM data t) x
WHERE x.r <=100
HAVING COUNT(*) > 100
Ideally I'd get the first 100 rows per user for those users who have at least 100 rows of data. How can I achieve this?
Also I suspect that a single query may not be the most efficient way to do this even if it's achievable. What would be the 'best practice' way to do this (with all the provisos that I'd really need to test on my own system of course)
Upvotes: 2
Views: 619
Reputation: 13700
Try this
SELECT userid, createdat, displaydate
FROM
( SELECT ROW_NUMBER() OVER
(PARTITION BY userid ORDER BY createdat) as r,count() OVER
(PARTITION BY userid) as counting
t.* FROM data t) x
WHERE x.r <=100 and counting > 100
Upvotes: 0
Reputation: 1271151
You can also do the count in the subquery using another window function:
SELECT userid, createdat, displaydate
FROM (SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY userid ORDER BY createdat) as seqnum,
COUNT(*) OVER (PARTITION BY userid) as cnt
FROM data t
) x
WHERE x.seqnum <= 100 AND x.cnt > 100;
Upvotes: 2