osm
osm

Reputation: 622

SQL - LIMIT by one column in GROUP BY

In trying to make a query in which I pull the top 1000 posts (by time_spent) per specified tag, I came up with the following query, where 1, 2, 3 are the specified tags:

SELECT g.tagid, e.post_id, SUM(e.time_spent) AS time
  FROM   post_table e
  JOIN (SELECT g.postid, g.tagid
          FROM   tags_table g
          WHERE  g.tagid IN (1, 2, 3)) g
       ON e.post_id = g.postid
  WHERE dt >= '2018-06-01'
  GROUP BY g.tagid, e.post_id
  ORDER BY time DESC
  LIMIT  1000

However, the problem with using LIMIT 1000 here is that it limits on the entire group and makes it so I only obtain 1000 results in total instead of obtaining 1000 results for each of tag 1, tag 2, and tag 3 (i.e. 3000 results in total).

How can I modify this query such that the LIMIT only occurs on the e.post_id component of the GROUP BY? Or alternatively, is there another way of obtaining 1000 results per each of the tags specified in the IN clause?

Upvotes: 0

Views: 33

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271131

Use row_number():

SELECT ge.*
FROM (SELECT g.tagid, e.post_id, SUM(e.time_spent) AS time,
             ROW_NUMBER() OVER (PARTITION BY g.tagid ORDER BY SUM(e.time_spent) ) as seqnum
      FROM post_table e JOIN 
           tags_table g
           ON e.post_id = g.postid
      WHERE  g.tagid IN (1, 2, 3) AND dt >= '2018-06-01'
      GROUP BY g.tagid, e.post_id
     ) ge
WHERE seqnum <= 1000
ORDER BY t.tagid, time DESC

Upvotes: 1

Related Questions