Reputation: 21
I have written a query in Google Big Query and want to get the same number of users I see in Google Analytics. I used Legacy and Normal SQL and got 3 different users numbers while the sessions were the same. What did I do wrong, or does anyone have an explanation/solution for it? Every help is appreciated!
Normal SQL
SELECT COUNT(DISTINCT fullVisitorId) AS users, SUM(IF(totals.visits IS
NULL,0,totals.visits)) AS sessions
FROM `XXX.XXX.ga_sessions_*`
WHERE _TABLE_SUFFIX BETWEEN '20181120' AND '20181120'
Legacy SQL
SELECT COUNT(DISTINCT fullVisitorId) AS users, SUM(IF(totals.visits IS
NULL,0,totals.visits)) AS sessions
FROM TABLE_DATE_RANGE([XXX:XXX.ga_sessions_], TIMESTAMP('2018-11-20'),
TIMESTAMP('2018-11-20'))
Upvotes: 1
Views: 79
Reputation: 1269443
I think this warning from the documentation explains what is happening:
In legacy SQL, COUNT(DISTINCT x) returns an approximate count. In standard SQL, it returns an exact count.
StandardSQL has the correct number. You can test this by attempting to use EXACT_COUNT_DISTINCT()
in legacy SQL.
Upvotes: 2