Reputation: 115
I'm trying to extract all columns in bigquery google analytics data WHERE I filtered by hostname. However, the result is still showing the other hostnames as seen on screenshot. I'm doing this select with a filter to copy data to another dataset with the same schema.
QUERY BELOW:
SELECT g.* FROM `10241241.ga_sessions_20160805` g, UNNEST (hits) as hits WHERE hits.page.hostname= "www.googlemerchandisestore.com"
UPDATED How can I copy the same schema without changing its nested structure?
Upvotes: 0
Views: 1686
Reputation: 4736
The values you want to filter for are sitting in an array ... so you can filter/clean the array or exclude rows with arrays not containing that value or exclude rows that contain not only that value. Not sure which applies for you, since you don't seem to have a reason to cross join the array with the whole table (which bloats it up tremendously).
Cleaning the array:
SELECT
*
REPLACE(
ARRAY(SELECT AS STRUCT * FROM g.hits WHERE page.hostname='www.googlemerchandisestore.com') AS hits
)
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20160805` g
limit 1000
Exclude rows that don't contain the value
SELECT
*
FROM `bigquery-public-data.google_analytics_sample.ga_sessions_20160805` g
WHERE
(SELECT COUNT(1)>0 FROM g.hits WHERE page.hostname='www.googlemerchandisestore.com')
limit 1000
etc.
Upvotes: 2
Reputation: 2365
You shouldn't add hits column again. Also, I'd recommend you to use a different name for unnested column. It'll be easier to understand, then.
SELECT g.* EXCEPT(hits), hit
FROM `10241241.ga_sessions_20160805` g, UNNEST (hits) as hit
WHERE hits.page.hostname= "www.googlemerchandisestore.com"
Upvotes: 0