Richard Rymer
Richard Rymer

Reputation: 153

Big Query and Google Analytics UI do not match when ecommerce action filter applied

We are validating a query in Big Query, and cannot get the results to match with the google analytics UI. A similar question can be found here, but in our case the the mismatch only occurs when we apply a specific filter on ecommerce_action.action_type.

Here is the query:

SELECT COUNT(distinct fullVisitorId+cast(visitid as string)) AS sessions
FROM (
  SELECT
    device.browserVersion,
    geoNetwork.networkLocation,
    geoNetwork.networkDomain,
    geoNetwork.city,
    geoNetwork.country,
    geoNetwork.continent,
    geoNetwork.region,
    device.browserSize,
    visitNumber,
    trafficSource.source,
    trafficSource.medium,
    fullvisitorId,
    visitId,
    device.screenResolution,
    device.flashVersion,
    device.operatingSystem,
    device.browser,
    totals.pageviews,
    channelGrouping,
    totals.transactionRevenue,
    totals.timeOnSite,
    totals.newVisits,
    totals.visits,
    date,
    hits.eCommerceAction.action_type
  FROM
    (select * 
     from TABLE_DATE_RANGE([zzzzzzzzz.ga_sessions_], 
    <range>) ))t
WHERE
 hits.eCommerceAction.action_type = '2' and <stuff to remove bots>
  )

From the UI using the built in shopping behavior report, we get 3.836M unique sessions with a product detail view, compared with 3.684M unique sessions in Big Query using the query above.

A few questions: 1) We are under the impression the shopping behavior report "Sessions with Product View" breakdown is based off of the ecommerce_action.actiontype filter. Is that true? 2) Is there a .totals pre-aggregated table that the UI maybe pulling from?

Upvotes: 1

Views: 1253

Answers (3)

Martin Weitzmann
Martin Weitzmann

Reputation: 4746

You're including product list views in your query. As described in https://support.google.com/analytics/answer/3437719 you need to make sure, that no product has isImpression = TRUE because that would mean it is a product list view.

This query sums all sessions which contain any action_type='2' for which all isProduct are null or false:

SELECT
  SUM(totals.visits) AS sessions
FROM
  `project.123456789.ga_sessions_20180101` AS t
WHERE
  (
  SELECT
    LOGICAL_OR(h.ecommerceaction.action_type='2')
  FROM
    t.hits AS h
  WHERE
    (SELECT LOGICAL_AND(isimpression IS NULL OR isimpression = FALSE) FROM h.product))

For legacySQL you can adapt the example in the documentation.

Upvotes: 1

Pol Ferrando
Pol Ferrando

Reputation: 673

In addition to the fact that COUNT(DISTINCT ...) is approximate when using legacy SQL, there could be sessions in which there are only non-interactive hits, which will not be counted as sessions in the Google Analytics UI but they are counted by both COUNT(DISTINCT ...) and EXACT_COUNT_DISTINCT(...) because in your query they count visit id's.

Using SUM(totals.visits) you should get the same result as in the UI because SUM does not take into account NULL values of totals.visits (corresponding to sessions in which there are only non-interactive hits).

Upvotes: 1

Elliott Brossard
Elliott Brossard

Reputation: 33745

It sounds like the issue is that COUNT(DISTINCT ...) is approximate when using legacy SQL, as noted in the migration guide, so the counts are not accurate. Either use standard SQL instead (preferred) or use EXACT_COUNT_DISTINCT with legacy SQL.

Upvotes: 2

Related Questions