Aaron Harris
Aaron Harris

Reputation: 415

Bigquery Google Analytics Users Not Counting Unique

I'm having issues with the below query in BigQuery for Google Analytics. For some reason I'm not able to count the number of users as unique, it essentially counts the number of rows so the numbers are very similar to sessions. I've also tried EXACT_COUNT_DISTINCT() but gives me the same answer.

    SELECT
  date AS Day,
  MAX(CASE
      WHEN hits.sourcePropertyInfo.sourcePropertyTrackingId CONTAINS '778****' THEN 'MUG'
      WHEN hits.sourcePropertyInfo.sourcePropertyTrackingId = 'Social' THEN 'Social'ELSE 'Website' END) AS Property,
  geoNetwork.country AS Country,
  SUM(totals.visits) AS visits,
  COUNT (DISTINCT(fullVisitorId), 1000000) AS Users,
  SUM(IFNULL(totals.newVisits,0)) AS NEW,
  (SUM(IFNULL(totals.screenviews,0))+SUM(IFNULL(totals.pageviews,0))) AS PAGEVIEWS,
  IFNULL(SUM(CASE
        WHEN totals.screenviews = 1 THEN SUM(IFNULL(totals.screenviews,0))
        ELSE 0 END)+ SUM(IFNULL(totals.bounces,0)),0) AS BOUNCES,
  SUM(CASE
      WHEN REGEXP_MATCH(hits.eventInfo.eventAction,'register$|registersuccess|new registration|account signup|registro') THEN 1
      ELSE 0 END) AS NewRegistrations,
  SUM(CASE
      WHEN REGEXP_MATCH(hits.eventInfo.eventAction, 'add to cart|add to bag|click to buy|ass to basket|comprar') OR hits.eventInfo.eventAction CONTAINS 'addtobasket::' THEN 1
      ELSE 0 END) AS ClickToBuy,
  SUM(IFNULL(totals.transactions,0)) AS Transactions,
  SUM(IFNULL(totals.transactionRevenue,0))/1000000 AS Revenue
FROM (TABLE_DATE_RANGE([****.ga_sessions_], TIMESTAMP('2017-03-15'), TIMESTAMP('2017-03-31'))),
GROUP BY
  Day,
  Country,
  geoNetwork.country,
  totals.screenviews;

Upvotes: 1

Views: 750

Answers (1)

Willian Fuks
Willian Fuks

Reputation: 11777

I just tested this query and it seems a bit simpler:

SELECT
date,
MAX(CASE
     WHEN hits.sourcePropertyInfo.sourcePropertyTrackingId CONTAINS '778****' THEN 'MUG'
     WHEN hits.sourcePropertyInfo.sourcePropertyTrackingId = 'Social' THEN 'Social'ELSE 'Website' END) AS Property,
geoNetwork.country AS Country,
SUM(totals.visits) AS visits,
COUNT(DISTINCT(fullVisitorId), 1000000) AS Users,
SUM(totals.newVisits) AS NEW,
SUM(totals.pageviews) AS PAGEVIEWS,
SUM(totals.bounces) AS BOUNCES,
SUM(CASE
      WHEN REGEXP_MATCH(hits.eventInfo.eventAction,'register$|registersuccess|new registration|account signup|registro') THEN 1
      ELSE 0 END) AS NewRegistrations,
SUM(CASE
      WHEN REGEXP_MATCH(hits.eventInfo.eventAction, 'add to cart|add to bag|click to buy|ass to basket|comprar|addtobasket::') THEN 1
      ELSE 0 END) AS ClickToBuy,
SUM(totals.transactions) AS Transactions,
SUM(totals.transactionRevenue) /1000000 AS Revenue
FROM (TABLE_DATE_RANGE([project_id:dataset_id.ga_sessions_], TIMESTAMP('2017-03-15'), TIMESTAMP('2017-03-31'))),
GROUP BY
date, Country

It did work in our database (not sure why you summed screenviews with pageviews though).

In Standard SQL (highly recommended that you use this version) maybe this already solves for you:

SELECT
date,
MAX(CASE
     WHEN exists(select 1 from unnest(hits) hits where regexp_contains(hits.sourcePropertyInfo.sourcePropertyTrackingId, r'778\*\*\*\*')) THEN 'MUG'
     WHEN exists(select 1 from unnest(hits) hits where hits.sourcePropertyInfo.sourcePropertyTrackingId = 'Social') THEN 'Social'ELSE 'Website' END) AS Property,
geoNetwork.country AS Country,
SUM(totals.visits) AS visits,
COUNT(DISTINCT(fullVisitorId)) AS Users,
SUM(totals.newVisits) AS new_,
SUM(totals.pageviews) AS PAGEVIEWS,
SUM(totals.bounces) AS BOUNCES,
SUM(CASE
      WHEN exists(select 1 from unnest(hits) hits where REGEXP_contains(hits.eventInfo.eventAction,'register$|registersuccess|new registration|account signup|registro')) THEN 1
      ELSE 0 END) AS NewRegistrations,
SUM(CASE
      WHEN exists(select 1 from unnest(hits) hits where REGEXP_contains(hits.eventInfo.eventAction, 'add to cart|add to bag|click to buy|ass to basket|comprar|addtobasket::')) THEN 1
      ELSE 0 END) AS ClickToBuy,
SUM(totals.transactions) AS Transactions,
SUM(totals.transactionRevenue) /1000000 AS Revenue
FROM `project_id.dataset_id.ga_sessions*`
where 1 = 1
and parse_timestamp("%Y%m%d", regexp_extract(_table_suffix, r'.*_(.*)')) between  TIMESTAMP('2017-03-15') and  TIMESTAMP('2017-03-31')
GROUP BY
date, Country

Upvotes: 2

Related Questions