Reputation: 77
Here is my query that works:
SELECT a_name, w_name, page_url, SUM(IF(result = 'WIN', 1, 0)) as Impressions,
from TABLE_DATE_RANGE(v3_data.v3_,TIMESTAMP('2015-08-05'),TIMESTAMP('2015-08-07'))
WHERE server ="server1" and w_id IN (1096, 1105, 1136,1153,1189,1209)
GROUP EACH BY 1,2,3
ORDER BY Impressions DESC
limit 1000000
Now my issue is I dont want to use a limit.
What I would like to do is choose the top 1000 results from each w_id, ordered by the impressions.
There are multiple w_ids and this query will be extended to multiple servers too.
originally I thought id try:
select top 1000 * from [code above] group by 1,2,3
but top doesnt work like that in big query and if I did that it wouldnt choose the top 1000 for each w_id it would do it over the whole set, so I could get none of the results for w_id 1209 if they are all relatively small.
I hope this is enough information, I've been given a lot of stick on here for being to vague, but im trying to give as much detail and be as logical as I can. Thanks in advance
Upvotes: 0
Views: 4973
Reputation: 1269673
BigQuery supports ROW_NUMBER()
which is the function you need to do this easily.
You do need to include w_id
in the group by
, but I think the following should do what you want:
SELECT t.*
FROM (SELECT a_name, w_name, page_url,
SUM(IF(result = 'WIN', 1, 0)) as Impressions,
ROW_NUMBER() OVER (PARTITION BY w_id
ORDER BY SUM(IF(result = 'WIN', 1, 0)) DESC
) as seqnum
from TABLE_DATE_RANGE(v3_data.v3_, TIMESTAMP('2015-08-05'), TIMESTAMP('2015-08-07'))
WHERE server = 'server1' and w_id IN (1096, 1105, 1136, 1153, 1189, 1209)
GROUP EACH BY 1, 2, 3, w_id
) t
WHERE seqnum <= 1000;
If BigQuery doesn't support row_number()
in an aggregation, you might need an additional layer of aggregation.
Upvotes: 1