TStienlet
TStienlet

Reputation: 3

Reasons for having pageview entries with a hour-minute combination in BigQuery but not in Google Analytics?

I'm trying to query the number of pageviews for a specific page in BigQuery (using count(p.page.pagePath)) but have more pageviews in Bigquery then in GA.

When comparing hour-minute combinations for a specific page (page title = Moteurs - Le Soir Plus) for a specific date (10th of October), I get hour-minute combinations in BigQuery that do not exist in GA.

How is this possible? Does GA not register some pageviews that BigQuery does?

Here is the query I use :

#standardSQL
CREATE TEMP FUNCTION
  customDimensionByIndex(indx INT64,
    arr ARRAY<STRUCT<index INT64,
    value STRING>>) AS ( (
    SELECT
      x.value
    FROM
      UNNEST(arr) x
    WHERE
      indx=x.index) );
SELECT distinct
  p.page.pageTitle,
  date,
  p.hour,
  p.minute,
  count(p.page.pagepath) as Pageviews
FROM
  `ga-ls-sw-233509.186661177.ga_sessions_*` AS st,
  UNNEST(hits) AS P
WHERE
  _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
  AND p.page.pageTitle = 'Moteurs - Le Soir Plus'
Group by
p.page.pageTitle,
date,
p.hour,
P.minute
ORDER BY
  p.hour DESC

Here are the two outputs (BQ and GA) for the 10th of October:

BQ:
Row page Title          date            hour    minute  Page views   
1   Moteurs - Le Soir Plus  20191010    16  13  1    
2   Moteurs - Le Soir Plus  20191010    16  14  1    
3   Moteurs - Le Soir Plus  20191010    16  16  1    
4   Moteurs - Le Soir Plus  20191010    14  29  1    
5   Moteurs - Le Soir Plus  20191010    14  32  1    
6   Moteurs - Le Soir Plus  20191010    11  19  1    
7   Moteurs - Le Soir Plus  20191010    11  20  1    
8   Moteurs - Le Soir Plus  20191010    9   17  1    
9   Moteurs - Le Soir Plus  20191010    9   24  1    
10  Moteurs - Le Soir Plus  20191010    8   18  1    
11  Moteurs - Le Soir Plus  20191010    8   23  1    
12  Moteurs - Le Soir Plus  20191010    8   24  1    
13  Moteurs - Le Soir Plus  20191010    8   29  1
14  Moteurs - Le Soir Plus  20191010    8   30  1
GA:
        Page Title          Date            Hour    Minute  Page views
1.      Moteurs - Le Soir Plus  20191010    16  13  1
2.      Moteurs - Le Soir Plus  20191010    16  14  1
3.      Moteurs - Le Soir Plus  20191010    14  29  1
4.      Moteurs - Le Soir Plus  20191010    11  19  1
5.      Moteurs - Le Soir Plus  20191010    09  17  1
6.      Moteurs - Le Soir Plus  20191010    08  18  1
7.      Moteurs - Le Soir Plus  20191010    08  23  1
8.      Moteurs - Le Soir Plus  20191010    08  29  1

Ex. the hour-minute combination 16h16 exists in BQ but not in GA.

Upvotes: 0

Views: 140

Answers (1)

Martin Weitzmann
Martin Weitzmann

Reputation: 4736

From what I can see, you're counting hits and not pageviews. Make sure P.type='PAGE' in the WHERE clause

Upvotes: 1

Related Questions