Phil Freeman
Phil Freeman

Reputation: 256

PostgreSQL LIMIT over WINDOW

I'm wondering if there's an easy way to limit a query to the top n windows.

i.e. say I have something like

SELECT field1
      ,field2
      ,field3
      ,sum(field2) over (partition by field1) sum2
      ,sum(field3) over (partition by field1) sum3
FROM table1
GROUP BY field1, field2, field3
ORDER BY sum2 DESC LIMIT 100

The above query returns the top 100 records, not the top 100 windows (which makes sense)

What I want to get is the top 100 sum2 windows, even though there may be multiple rows inside that window. So I might be getting 400 records, but only the top 100 windows.

Hopefully that makes sense.

Upvotes: 3

Views: 2240

Answers (1)

Erwin Brandstetter
Erwin Brandstetter

Reputation: 656331

After the comment and giving it some more thought, I think the following query does what you want.

I pick the first 100 "windows" resulting from the query and return all rows that fall in to those windows. As the windows are partitioned by field1 that's effectively the 100 distinct values of field1 with the greatest sum2. For ties on sum2 the greater field1 wins in my query (you did not specify).

WITH x AS (
    SELECT field1
          ,field2
          ,field3
          ,sum(field2) over w sum2
          ,sum(field3) over w sum3
    FROM   table1
    GROUP  BY field1, field2, field3
    WINDOW w AS (PARTITION BY field1) 
    )
    , y AS (
    SELECT field1
    FROM   x
    GROUP  BY sum2, field1
    ORDER  BY sum2 DESC, field1 DESC
    LIMIT  100
    )
SELECT x.*
FROM   y
JOIN   x USING (field1)
ORDER  BY sum2 DESC, field1 DESC, field2 DESC, field3 DESC;

The crucial point is to generate the aggregate values in a CTE, pick the 100 winning windows out of those in another CTE (could also be done with DISTINCT, I chose a GROUP BY / ORDER BY), and join the result back to the first CTE to get all rows for those windows.

All in all this is quite a complex query.

Upvotes: 1

Related Questions