Reputation: 256
I'm wondering if there's an easy way to limit a query to the top n windows.
i.e. say I have something like
SELECT field1
,field2
,field3
,sum(field2) over (partition by field1) sum2
,sum(field3) over (partition by field1) sum3
FROM table1
GROUP BY field1, field2, field3
ORDER BY sum2 DESC LIMIT 100
The above query returns the top 100 records, not the top 100 windows (which makes sense)
What I want to get is the top 100 sum2 windows, even though there may be multiple rows inside that window. So I might be getting 400 records, but only the top 100 windows.
Hopefully that makes sense.
Upvotes: 3
Views: 2240
Reputation: 656331
After the comment and giving it some more thought, I think the following query does what you want.
I pick the first 100 "windows" resulting from the query and return all rows that fall in to those windows. As the windows are partitioned by field1
that's effectively the 100 distinct values of field1
with the greatest sum2
. For ties on sum2
the greater field1
wins in my query (you did not specify).
WITH x AS (
SELECT field1
,field2
,field3
,sum(field2) over w sum2
,sum(field3) over w sum3
FROM table1
GROUP BY field1, field2, field3
WINDOW w AS (PARTITION BY field1)
)
, y AS (
SELECT field1
FROM x
GROUP BY sum2, field1
ORDER BY sum2 DESC, field1 DESC
LIMIT 100
)
SELECT x.*
FROM y
JOIN x USING (field1)
ORDER BY sum2 DESC, field1 DESC, field2 DESC, field3 DESC;
The crucial point is to generate the aggregate values in a CTE, pick the 100 winning windows out of those in another CTE (could also be done with DISTINCT, I chose a GROUP BY
/ ORDER BY
), and join the result back to the first CTE to get all rows for those windows.
All in all this is quite a complex query.
Upvotes: 1