Reputation: 1352
In my BQ standardsql query when i use few analytic functions ( https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#analytic-functions ) i get this error:
Resources exceeded during query execution. For more information, see https://cloud.google.com/bigquery/troubleshooting-errors
The query has few fields calculated more or less similar to this:
case when 1 = ROW_NUMBER() over (partition by Y,m,operatingSystem)
then count(distinct case when IsNewVisit = 1 then fullvisitorid else null end)
over (partition by Y,m,operatingSystem)
else null end as NewUniqueVisitorsMonthlyOS
When i split the query and run each part one by one they all work well. However, I don't want to split the query into multiple ones because i need to have all the fields in one final BQ View.
Is there any way i can fix this error?
UPD: Here is an example of the query. When i add more fields it stops working with the above mentioned error.
SELECT
distinct
Date
,channelGrouping
,country
,browser
,deviceCategory
,operatingSystem
#Visits by all dimensions
,count(distinct concat(fullvisitorid,cast(visitid as string)))
over (partition by concat(Y,m,d),channelGrouping,country,browser,deviceCategory,operatingSystem)
as Visits
#Daily Users Browser
,case when 1 = ROW_NUMBER() over (partition by Y,m,d,browser)
then count (distinct fullvisitorid)
over (partition by Y,m,d,browser)
else null end as UniqueVisitorsDailyBrowser
#Weekly New Users
,case when 1 = ROW_NUMBER() over (partition by Y,U,channelGrouping)
then count(distinct case when IsNewVisit = 1 then fullvisitorid else null end)
over (partition by Y,U,channelGrouping)
else null end as NewUniqueVisitorsWeeklyChannel
#Monthly New Users
,case when 1 = ROW_NUMBER() over (partition by Y,m,operatingSystem)
then count(distinct case when IsNewVisit = 1 then fullvisitorid else null end)
over (partition by Y,m,operatingSystem)
else null end as NewUniqueVisitorsMonthlyOS
FROM GA_Export_Schema
Upvotes: 0
Views: 681
Reputation: 11797
What I usually do when I need to answer several different questions with a single query is I try to use some UNION ALL
operation with some key to distinguish the data.
I tested this query in our dataset based on your query:
SELECT
date date,
country country,
channel channel,
browser browser,
cat cat,
os os,
MAX(all_keys_visits) all_keys_visits,
MAX(browser_visits) browser_visits,
MAX(week_new_channel_users) week_new_channel_users,
MAX(month_new_os_users) month_new_os_users from(
SELECT
date,
country,
channel,
browser,
cat,
os,
visits AS all_keys_visits,
MAX(CASE
WHEN tmp = 'browser' THEN visits END) OVER(PARTITION BY browser) browser_visits, MAX(CASE
WHEN tmp = 'weekly_new_users' THEN visits END) OVER(PARTITION BY channel, week_date) week_new_channel_users,
MAX(CASE
WHEN tmp = 'monthly_new_users' THEN visits END) OVER(PARTITION BY os, month_date) month_new_os_users from(
SELECT
tmp,
date,
week_date,
month_date,
country,
channel,
browser,
cat,
os,
visits FROM (
SELECT
'all_visitors' AS tmp,
date,
FORMAT_DATE("%W", parse_DATE('%Y%m%d', date)) week_date,
FORMAT_DATE("%m", parse_DATE('%Y%m%d', date)) month_date,
geonetwork.country country,
channelGrouping channel,
device.browser browser,
device.devicecategory cat,
device.operatingSystem os,
COUNT(DISTINCT CONCAT(fullvisitorid, CAST(visitid AS string))) visits
FROM `project_id.dataset_id.ga_sessions*`
WHERE 1 = 1
AND REGEXP_EXTRACT(_table_suffix, r'.*_(.*)') BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 0 DAY)) GROUP BY tmp, date, channel, country, browser, cat, os )
UNION ALL (
SELECT
'browser' AS tmp,
date,
FORMAT_DATE("%W", parse_DATE('%Y%m%d', date)) week_date,
FORMAT_DATE("%m", parse_DATE('%Y%m%d', date)) month_date,
'' country,
'' channel,
device.browser browser,
'' cat,
'' os,
COUNT(DISTINCT CONCAT(fullvisitorid, CAST(visitid AS string))) visits
FROM `project_id.dataset_id.ga_sessions*` WHERE 1 = 1 AND REGEXP_EXTRACT(_table_suffix, r'.*_(.*)') BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)) AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 0 DAY))
GROUP BY tmp, date, channel, country, browser, cat, os )
UNION ALL (
SELECT
'weekly_new_users' AS tmp,
date,
FORMAT_DATE("%W", parse_DATE('%Y%m%d', date)) week_date,
FORMAT_DATE("%m", parse_DATE('%Y%m%d', date)) month_date,
'' country,
channelGrouping channel,
'' browser,
'' cat,
'' os,
COUNT(DISTINCT CASE
WHEN totals.newVisits = 1 THEN CONCAT(fullvisitorid, CAST(visitid AS string)) END) visits
FROM `project_id.dataset_id.ga_sessions*`
WHERE
1 = 1
AND REGEXP_EXTRACT(_table_suffix, r'.*_(.*)') BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 0 DAY))
GROUP BY
tmp,
date,
channel,
country,
browser,
cat,
os )
UNION ALL (
SELECT
'monthly_new_users' AS tmp,
date,
FORMAT_DATE("%W", parse_DATE('%Y%m%d',
date)) week_date,
FORMAT_DATE("%m", parse_DATE('%Y%m%d',
date)) month_date,
'' country,
'' channel,
'' browser,
'' cat,
device.operatingSystem os,
COUNT(DISTINCT
CASE
WHEN totals.newVisits = 1 THEN CONCAT(fullvisitorid, CAST(visitid AS string)) END) visits
FROM
`project_id.dataset_id.ga_sessions*`
WHERE
1 = 1
AND REGEXP_EXTRACT(_table_suffix, r'.*_(.*)') BETWEEN FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
AND FORMAT_DATE("%Y%m%d", DATE_SUB(CURRENT_DATE(), INTERVAL 0 DAY))
GROUP BY
tmp,
date,
channel,
country,
browser,
cat,
os ) ) )
GROUP BY
date,
country,
channel,
browser,
cat,
os
HAVING
(country != ''
AND channel != ''
AND browser != ''
AND cat != ''
AND os != '')
Basically on each UNION I create a key and later on I made an aggregation based on that key and the values you want to analyze. After that, I just removed the fields created as empty strings.
I tried processing 30 days of data which consumed a few gigas and still it completed in less then 20 seconds so it might work for you as well (notice that here by running the unions separately and then aggregating later you end up working in a much smaller data volume which avoids resource depletion).
Upvotes: 0