Reputation: 127
I'm looking at the particular actions of one visitor on our site just to gain an understanding of how some data is collected and how to pull the data that I need. Basically specific promotions are fired when a visitor sees certain sections of our site and I wanted to see if this data is being collected.
I've run the following code to identify the pages a visitor sees and the events their session fires:
select fullvisitorid, visitid, date, hitnumber, type, page.pagepath,
eventinfo.eventcategory, eventinfo.eventlabel, eventinfo.eventaction
from `big-query-156009.xxxxxx.ga_sessions_*` t, t.hits as hits
where _table_suffix between "20170511" and "20170511"
and fullvisitorid = 'xxxxxxxxxxxxxxxxxxxxxxxxxx'
and visitid = xxxxxxxxx
order by hitnumber
This works perfectly fine and I can see the journey of the visitor. I also want to pull in
hits.promotion.promoid,
hits.promotion.promoname,
hits.promotion.promocreative,
hits.promotion.promoposition,
hits.promotionactioninfo.promoisview,
hits.promotionactioninfo.promoisclick
I've tried this using the following code:
select fullvisitorid,
visitid,
date,
hitnumber,
type,
page.pagepath,
eventinfo.eventcategory,
eventinfo.eventlabel,
eventinfo.eventaction,
promotion.promoId,
promotion.promoname,
promotion.promocreative,
promotion.promoposition,
promotionactioninfo.promoIsView,
promotionactioninfo.promoIsclick
from `big-query-156009.xxxxxx.ga_sessions_*` t, t.hits as hits,
hits.promotion as promotion
where _table_suffix between "20170511" and "20170511"
and fullvisitorid = 'xxxxxxxxxxxxxxxxxxxxxxxxxx'
and visitid = xxxxxxxxx
order by hitnumber
however when I do this will exactly the same fullvisitorid, visitid, dates and dataset etc I get no results. No errors or anything just no results.
I don't fully understanding nesting so I'm assuming I'm just missing something simple.
Is there any way around this to pull all this data in one query or do I need to do sub-queries?
Thank you
Upvotes: 0
Views: 1314
Reputation: 11777
Another technique that may help you in these type of analyzes is running as follows:
SELECT
fullvisitorid fv,
visitid v,
ARRAY(SELECT AS STRUCT hits.hitnumber, hits.type, page.pagepath, eventinfo.eventcategory, promotion FROM UNNEST(hits) hits ORDER BY hitnumber) hits
FROM `big-query-156009.xxxxxx.ga_sessions_*`
WHERE
1 = 1
AND EXISTS(SELECT 1 FROM UNNEST(hits) hits WHERE ARRAY_LENGTH(hits.promotion) > 0)
Notice that the UNNESTING
process now happens only in the row level, not as a CROSS-JOIN.
You can also use the WHERE clause to filter out what exactly you want (in this case I removed all rows that didn't have any promotion ID associated to them. You can also for instance filter out only certain eventcategories or labels if you want).
Upvotes: 0
Reputation: 33705
By using the comma (join) operator between the table, t.hits
, and hits.promotion
, you are taking the cross product of the elements of the arrays. What's happening in this case is that because one of the arrays is empty (probably hits.promotion
), you get an empty result since 1 * <number of hits> * 0 = 0
. To get a row even when one of the arrays is empty, use a left join instead, e.g.:
from `big-query-156009.xxxxxx.ga_sessions_*` t
LEFT JOIN UNNEST(t.hits) as hits
LEFT JOIN UNNEST(hits.promotion) as promotion
where ...
Upvotes: 3