Maarten Berge
Maarten Berge

Reputation: 52

Counting google analytics unique events in BigQuery

I have managed to calculate total events by ISOweek but not unique events for a given Google Analytics Event using BigQuery. When checking GA, total_events matches the GA interface on the dot but unique_events are off. Do you know how I can solve this?

The query:

SELECT INTEGER(STRFTIME_UTC_USEC(PARSE_UTC_USEC(date),"%V")) iso8601_week_number,
hits.eventInfo.eventCategory,
hits.eventInfo.eventAction,
COUNT(hits.eventInfo.eventCategory) AS total_events,
EXACT_COUNT_DISTINCT(fullVisitorId) AS unique_events
FROM
    TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2017-05-01'), TIMESTAMP('2017-05-07'))
WHERE
  hits.type = 'EVENT' AND hits.eventInfo.eventCategory = 'BIG_Transaction'
GROUP BY
iso8601_week_number, hits.eventInfo.eventCategory, hits.eventInfo.eventAction

Upvotes: 1

Views: 12388

Answers (4)

Kevin Dasilva
Kevin Dasilva

Reputation: 61

We did something similar to what @Martin was suggesting with some cool CTEs and we were able to get an 100% match on what was coming out of Google Analytics from BigQuery.

Checkout the code snippet below that returns a per day sum of sessions + unique Add to Cart events:

   #standardSQL
WITH AN_ATC AS 
(
  SELECT
    -- full date w/ hyphens (ie 2021-01-07)
      CAST(format_date('%Y-%m-%d', parse_date("%Y%m%d", date)) AS DATE) as DATE,
      -- COUNT OF SESSIONS
      COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitStartTime AS STRING))) AS Sessions,
      -- COUNT OF UNIQUE EVENTS PER SESSION
       COUNT(DISTINCT CONCAT(fullvisitorid, CAST(visitstarttime AS string), 
        COALESCE(hits.eventinfo.eventaction,''), 
        COALESCE(hits.eventinfo.eventlabel, ''))) AS EVENTS
  
  FROM `an-big-query.PROJECT_ID.ga_sessions_*` , 
  UNNEST(hits) as hits
  WHERE
  -- start date
  _table_suffix BETWEEN '20190101'
  -- yesterday
  AND FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(),INTERVAL 1 DAY)) 
  AND hits.eventInfo.eventAction = 'add to cart'
  GROUP BY 
    date
)
SELECT
DATE,
SESSIONS,
EVENTS
FROM AN_ATC
ORDER BY date DESC

Where,

SESSIONS = Google Analytics ga:Sessions

and

EVENTS = Google Analytics ga:uniqueEvents

BOTH with eventAction=@add to cart

Hope that helps everyone that was searching/googling!

Upvotes: 0

Martin Weitzmann
Martin Weitzmann

Reputation: 4736

Depending on the scope you need to count(distinct ) different things, but you always need to fulfill these conditions:

  • unique events refer to the combination of category, action and label
  • make sure eventAction is not NULL
  • make sure eventLabel is not NULL
  • eventCategory is allowed be NULL

I'm using COALESCE() to avoid NULLs

Example Session Scope

SELECT
  SUM( (SELECT COUNT(h.eventInfo.eventCategory) FROM t.hits h) ) events,
  SUM( (SELECT COUNT(DISTINCT 
    CONCAT( h.eventInfo.eventCategory,
      COALESCE(h.eventinfo.eventaction,''),
      COALESCE(h.eventinfo.eventlabel, ''))
      )
    FROM
      t.hits h ) ) uniqueEvents
FROM
  `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910` t

Example Hit Scope

SELECT
  h.eventInfo.eventCategory,
  COUNT(1) events,
  -- we need to take sessions into account, so we add fullvisitorid and visitstarttime
  COUNT(DISTINCT CONCAT(fullvisitorid, CAST(visitstarttime AS string), 
    COALESCE(h.eventinfo.eventaction,''), 
    COALESCE(h.eventinfo.eventlabel, ''))) uniqueEvents
FROM
  `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910` t,
  t.hits h
WHERE
  h.type='EVENT'
GROUP BY
  1
ORDER BY
  2 DESC

hth!

Upvotes: 4

Pol Ferrando
Pol Ferrando

Reputation: 673

The definition of unique events in Google Analytics is:

A count of the number of times an event with the category/action/label value was seen at least once within a session.

In other words, the number of sessions in which a specific event (defined by category, action AND label) was sent. In your query, you count the number of unique visitors that had the event, while you need to count the number of sessions and keep in mind that events with different labels should be counted as different unique events (although we are only interested in category and action).

A possible way to fix your code is:

SELECT 
  INTEGER(STRFTIME_UTC_USEC(PARSE_UTC_USEC(date),"%V")) iso8601_week_number,
  hits.eventInfo.eventCategory,
  hits.eventInfo.eventAction,
  COUNT(hits.eventInfo.eventCategory) AS total_events,
  EXACT_COUNT_DISTINCT(CONCAT(fullVisitorId,'-',string(visitId),'-',date,'-',ifnull(hits.eventInfo.eventLabel,'null'))) AS unique_events    
FROM
  TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2017-05-01'), TIMESTAMP('2017-05-07'))
WHERE
  hits.type = 'EVENT' AND hits.eventInfo.eventCategory = 'BIG_Transaction'
GROUP BY
    iso8601_week_number, hits.eventInfo.eventCategory, hits.eventInfo.eventAction

The results of this query should match with the data in the GA interface.

Upvotes: 1

andre622
andre622

Reputation: 515

I believe the issue is that you are only counting the number of unique visitors have completed the specified action, while GA defines unique events as "The number of times during a date range that a session contained the specific dimension".

Therefore, I would just change your code to the below:

SELECT INTEGER(STRFTIME_UTC_USEC(PARSE_UTC_USEC(date),"%V")) iso8601_week_number,
hits.eventInfo.eventCategory,
hits.eventInfo.eventAction,
COUNT(hits.eventInfo.eventCategory) AS total_events,
EXACT_COUNT_DISTINCT(CONCAT(fullVisitorId, STRING(visitId))) AS unique_events
FROM
    TABLE_DATE_RANGE([XXXXXX.ga_sessions_], TIMESTAMP('2017-05-01'), TIMESTAMP('2017-05-07'))
WHERE
  hits.type = 'EVENT' AND hits.eventInfo.eventCategory = 'BIG_Transaction'
GROUP BY
iso8601_week_number, hits.eventInfo.eventCategory, hits.eventInfo.eventAction

This should give you the distinct count of sessions that had the given events.

Upvotes: 0

Related Questions