Thomas Chamberlain
Thomas Chamberlain

Reputation: 127

Why when unnesting two or more variables in BigQuery do I get no results?

I'm looking at the particular actions of one visitor on our site just to gain an understanding of how some data is collected and how to pull the data that I need. Basically specific promotions are fired when a visitor sees certain sections of our site and I wanted to see if this data is being collected.

I've run the following code to identify the pages a visitor sees and the events their session fires:

select fullvisitorid, visitid, date, hitnumber, type, page.pagepath, 
eventinfo.eventcategory, eventinfo.eventlabel, eventinfo.eventaction
from `big-query-156009.xxxxxx.ga_sessions_*` t, t.hits as hits
where _table_suffix between "20170511" and "20170511"
and fullvisitorid = 'xxxxxxxxxxxxxxxxxxxxxxxxxx'
and visitid = xxxxxxxxx
order by hitnumber

This works perfectly fine and I can see the journey of the visitor. I also want to pull in

hits.promotion.promoid, 
hits.promotion.promoname, 
hits.promotion.promocreative, 
hits.promotion.promoposition, 
hits.promotionactioninfo.promoisview,
hits.promotionactioninfo.promoisclick

I've tried this using the following code:

select fullvisitorid, 
visitid, 
date, 
hitnumber, 
type, 
page.pagepath, 
eventinfo.eventcategory, 
eventinfo.eventlabel, 
eventinfo.eventaction, 
promotion.promoId, 
promotion.promoname, 
promotion.promocreative, 
promotion.promoposition,
promotionactioninfo.promoIsView,
promotionactioninfo.promoIsclick
from `big-query-156009.xxxxxx.ga_sessions_*` t, t.hits as hits,
hits.promotion as promotion
where _table_suffix between "20170511" and "20170511"
and fullvisitorid = 'xxxxxxxxxxxxxxxxxxxxxxxxxx'
and visitid = xxxxxxxxx
order by hitnumber

however when I do this will exactly the same fullvisitorid, visitid, dates and dataset etc I get no results. No errors or anything just no results.

I don't fully understanding nesting so I'm assuming I'm just missing something simple.

Is there any way around this to pull all this data in one query or do I need to do sub-queries?

Thank you

Upvotes: 0

Views: 1314

Answers (2)

Willian Fuks
Willian Fuks

Reputation: 11777

Another technique that may help you in these type of analyzes is running as follows:

SELECT
    fullvisitorid fv,
    visitid v,
    ARRAY(SELECT AS STRUCT hits.hitnumber, hits.type, page.pagepath, eventinfo.eventcategory, promotion FROM UNNEST(hits) hits ORDER BY hitnumber) hits
FROM `big-query-156009.xxxxxx.ga_sessions_*`
WHERE
1 = 1
AND EXISTS(SELECT 1 FROM UNNEST(hits) hits WHERE ARRAY_LENGTH(hits.promotion) > 0)

Notice that the UNNESTING process now happens only in the row level, not as a CROSS-JOIN.

You can also use the WHERE clause to filter out what exactly you want (in this case I removed all rows that didn't have any promotion ID associated to them. You can also for instance filter out only certain eventcategories or labels if you want).

Upvotes: 0

Elliott Brossard
Elliott Brossard

Reputation: 33705

By using the comma (join) operator between the table, t.hits, and hits.promotion, you are taking the cross product of the elements of the arrays. What's happening in this case is that because one of the arrays is empty (probably hits.promotion), you get an empty result since 1 * <number of hits> * 0 = 0. To get a row even when one of the arrays is empty, use a left join instead, e.g.:

from `big-query-156009.xxxxxx.ga_sessions_*` t
LEFT JOIN UNNEST(t.hits) as hits
LEFT JOIN UNNEST(hits.promotion) as promotion
where ...

Upvotes: 3

Related Questions