Ben P
Ben P

Reputation: 3379

Count pageviews by page

I would like to count the number of pageviews by page in BigQuery, using Google Analytics data source tables. I only want to count pages that have the custom page content grouping of ProductList_UA or ProductDetails_UAand I want to trim all the parameters from the end of the page URL so that I return a more manageable list of pages.

So far, my query looks as below, but my count of pageviews, bounces and exits are far too high (about 8x) - where am I going wrong?

SELECT IFNULL(REGEXP_EXTRACT(hits.page.pagePath,r'^(.*?)\?'), hits.page.pagePath) AS Trimmed_Page, COUNT(hits.page.pagepath) AS Pageviews, SUM(totals.bounces) AS Bounces, SUM(IF(hits.isexit = TRUE, 1,0)) AS Exits, SUM(IF(hits.isentrance = TRUE, 1,0)) AS Entrances, MIN(hits.contentGroup.contentGroup3) AS Content_Group
FROM `xxx.ga_sessions_20*` AS m
  CROSS JOIN UNNEST(m.customdimensions) AS customDimension
  CROSS JOIN UNNEST(m.hits) AS hits
WHERE parse_date('%y%m%d', _table_suffix) between 
DATE_sub(current_date(), interval 1 day) and
DATE_sub(current_date(), interval 1 day)
AND (hits.contentGroup.contentGroup3 = 'ProductList_UA' OR hits.contentGroup.contentGroup3 = 'ProductDetails_UA')
AND hits.type="PAGE"
AND hits.isInteraction = TRUE
GROUP BY Trimmed_Page
ORDER BY Pageviews DESC
LIMIT 1000

Upvotes: 0

Views: 595

Answers (1)

Victor Mota
Victor Mota

Reputation: 1269

I suspect the cross join with customDimensions is the cause of you seeing more results than expected, as each row of hits will be multiplied by the number of customDimensions in that row. Experiment without that cross join to see if it solves the issue.

Upvotes: 1

Related Questions