helloB
helloB

Reputation: 3582

Postgres: return first N rows per group for groups having a minimum countI

I copied the logic of this helpful post Grouped LIMIT in PostgreSQL: show the first N rows for each group? to get the first N rows by group, ordered by creation date. However, I'm really only interested in this for high-volume users with some minimum row count. Does this have to be done as a set of two distinct queries or can I modify my query below?

SELECT userid, createdat, displaydate
FROM 
     ( SELECT ROW_NUMBER() OVER 
         (PARTITION BY userid ORDER BY createdat) as r, 
          t.* FROM data t) x 
WHERE x.r <=100

In particular I tried the following, which led to an error:

SELECT userid, createdat, displaydate
FROM 
     ( SELECT ROW_NUMBER() OVER 
         (PARTITION BY userid ORDER BY createdat) as r, 
          t.* FROM data t) x 
WHERE x.r <=100
HAVING COUNT(*) > 100

Ideally I'd get the first 100 rows per user for those users who have at least 100 rows of data. How can I achieve this?

Also I suspect that a single query may not be the most efficient way to do this even if it's achievable. What would be the 'best practice' way to do this (with all the provisos that I'd really need to test on my own system of course)

Upvotes: 2

Views: 619

Answers (2)

Madhivanan
Madhivanan

Reputation: 13700

Try this

SELECT userid, createdat, displaydate
FROM 
     ( SELECT ROW_NUMBER() OVER 
         (PARTITION BY userid ORDER BY createdat) as r,count() OVER 
         (PARTITION BY userid) as counting
          t.* FROM data t) x 
WHERE x.r <=100 and counting > 100

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271151

You can also do the count in the subquery using another window function:

SELECT userid, createdat, displaydate
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY userid ORDER BY createdat) as seqnum, 
             COUNT(*) OVER (PARTITION BY userid) as cnt
      FROM data t
     ) x 
WHERE x.seqnum <= 100 AND x.cnt > 100;

Upvotes: 2

Related Questions