Reputation: 3379
I am working on nested Google Analytics data to build a query, I need to unnest 3 levels in order to get all the fields I need, but once I have unnested the SUM() of my .totals
fields are far too high, I assume because their values are being repeated.
I have no way if identifying, for example, bounces, at hit level, so I need to use totals.bounces
to get this value.
How could I adjust my query below to get the correct sum for bounces and revenue, alongside my totals for the unnested values?
SELECT
customDimension.value AS UserID,
# Visits from this individual
SUM(totals.visits) AS visits,
# Orders from this individual
COUNT(DISTINCT hits.transaction.transactionId) AS orders,
# AOV
SAFE_DIVIDE(SUM(hits.transaction.transactionRevenue)/1000000 , COUNT(DISTINCT hits.transaction.transactionId)) AS AOV,
#Bounces from this individual
IFNULL(SUM(totals.bounces),
0) AS bounces,
IFNULL(SUM(hits.transaction.transactionRevenue)/1000000,
0) AS revenue,
# Conversion rate of the individual
SAFE_DIVIDE(COUNT(DISTINCT hits.transaction.transactionId),COUNT(DISTINCT CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING)))) AS conversion_rate,
ROUND(IFNULL(SUM(totals.bounces)/COUNT(DISTINCT CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING))),
0),5) AS bounce_rate,
FROM
`MY.DATA.ga_sessions_20*` AS t
CROSS JOIN
UNNEST (hits) AS hits
CROSS JOIN
UNNEST(t.customdimensions) AS customDimension
CROSS JOIN
UNNEST(hits.product) AS hits_product
WHERE parse_date('%y%m%d', _table_suffix) between
DATE_sub(current_date(), interval 7 day) and
DATE_sub(current_date(), interval 1 day)
AND customDimension.index = 2
AND customDimension.value NOT LIKE "true"
AND customDimension.value NOT LIKE "false"
AND customDimension.value NOT LIKE "undefined"
AND customDimension.value IS NOT NULL
GROUP BY
UserID,
hits.eventInfo.eventCategory
Upvotes: 2
Views: 424
Reputation: 3379
Now that I have a little more experience with BigQuery, I can answer this question based on how I achieve this today.
Using WITH()
I create multiple queries and includes onto the fields I need at those levels, for example, my first WITH()
statement will have no unnesting and will correctly sum the totals.
fields. A second WITH()
can then UNNEST()
the hits level and sum the fields I want to count within here.
These queries can then be united with a common join and will show the correct values, without duplication, at each level of the unnesting.
Upvotes: 1