Reputation: 4594
I have requests logs in a Bigquery dataset with the following simplified format:
timestamp | text | machine_type
2021-06-03 12:49:22.198412 UTC | <request>...</request> | small
2021-06-03 12:46:21.198412 UTC | <request>...</request> | medium
2021-06-03 12:52:29.198412 UTC | Connection intialization | medium
2021-06-03 12:12:12.198412 UTC | Error: xxx | big
I need to get the second with the maximum number of requests per second for every type of machine there is and which second was the one with the highest load. Something like:
timestamp | maximum_number_of_requests | machine_type
2021-06-03 12:46:21.198412 UTC | 232 | medium
2021-02-03 11:29:12.193311 UTC | 15 | small
I don't find a way to group by requests, count them and max them, while including the timestamp at the same time.
Upvotes: 0
Views: 118
Reputation: 1269773
This is a good place to use qualify
:
select timestamp_trunc(timestamp, second) as timestamp,
count(*) as number_of_requests,
machine_type
from mytable
group by 1, 3
qualify row_number() over (partition by machine_type order by count(*) desc) = 1;
Upvotes: 1
Reputation: 10172
Try this one:
select array_agg(t order by number_of_requests desc limit 1)[offset(0)].*
from (
select
timestamp_trunc(timestamp, second) as timestamp,
count(*) as number_of_requests,
machine_type
from mytable
group by 1, 3
) t
group by t.machine_type
Upvotes: 1