Reputation: 21
The following query is running into a Resource exceeded during query execution error on bigquery. The data processed is around 700 MB , which is not that much , as displayed on the bigquery console. We are using group each by because on using group by we get the same error along with the suggestion to use group each by. The associated job id is
Job ID: fast-chess-620:job_41Fq1q3zFGB3FsACtuAiymTOCIU (group each by)
Job ID: fast-chess-620:job_VVd2jPGX-nHsdZW5GlEU6bBgpnU (group by)
select col_1,col_2, count(col_3) from
(select col_1, col_2, col_3 from
(select col_1, date(sec_to_timestamp(col_4)) as col_2, count(col_5) as col_3 from
(TABLE_DATE_RANGE(table_prefix_1_,
date_add(usec_to_timestamp(utc_usec_to_month(now())), -6, "MONTH"),
date_add(usec_to_timestamp(utc_usec_to_month(now())), -1, "MONTH"))),
(TABLE_DATE_RANGE([table_prefix_2_],
usec_to_timestamp(utc_usec_to_month(now())),
usec_to_timestamp(utc_usec_to_day(now()))))
group each by 1,2 order by 1,2) x) x
group each by 1,2 order by 1,2
Can you please help us resolve the issue.
Upvotes: 1
Views: 611
Reputation: 26637
It looks like the issue is the ORDER BY. See the stackoverflow response here.
Looking at the logs for your query, the group by produces more than 15 million results. In order to sort them, bigquery must perform the sort operation in a single node.
Do you really need a sorted result? If you do, do you need all of the results? If you use an ORDER BY with a LIMIT, it should succeed, since it can just keep the top values.
You should be able to get this query to run successfully if you:
Upvotes: 3