Reputation: 11
I am working with Google Analytics data in BigQuery and trying to combine Sessions Ids with Transaction IDs. I am facing some difficulties as I do not understand yet how to combin Session Ids (to match with GA UI I've used totals.visits = 1
in WHERE clause) with Transactions IDs (UNNEST(hits)
). Condition: Data should match with the GA UI.
Desired outcome
Date | Session ID | Transaction ID
2019-06-18 00:00:00.000 UTC | 400101010101 | AAAAAAA
2019-06-18 00:00:00.000 UTC | 400202010102 | BBBBBBB
2019-06-18 00:00:00.000 UTC | 400303010103 | CCCCCC
Current query to retrieve Timestamp and SessionIds
#standardSQL
SELECT
TIMESTAMP(PARSE_DATE('%Y%m%d', date)) AS Date,
CONCAT(fullVisitorId, CAST(visitId AS STRING) ) AS SessionID
COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitId AS STRING) )) AS Sessions, /* verify match with GA UI*/
FROM
`ga-xxxxx.yyyyyyyy_sessions_20*`
WHERE
_TABLE_SUFFIX = '180618'
AND totals.visits = 1
GROUP BY
1,2
Current query expended with TransactionID
When expending the query with hits.transaction.transactionId
and UNNEST(hits)
I get more records than in GA UI. Seems like the query is counting the SessionIDs plus TransactionIDs.
#standardSQL
SELECT
TIMESTAMP(PARSE_DATE('%Y%m%d', date)) AS Date,
CONCAT(fullVisitorId, CAST(visitId AS STRING) ) AS SessionID,
hits.transaction.transactionId AS transactionId,
COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitId AS STRING) )) AS Sessions, /* verify match with GA UI*/
FROM
`ga-xxxxx.yyyyyyyy_sessions_20*`,
UNNEST(hits) AS hits
WHERE
_TABLE_SUFFIX = '180618'
AND totals.visits = 1
GROUP BY
1,2,3
How can I get SessionID
and hits.transaction.transactionId
into the query with the condition that the data match with GA UI?
Hope somebody can help me out.
Thanks.
Upvotes: 1
Views: 4750
Reputation: 3369
To match the count of sessions in the GA UI back to BigQuery you can count sessions like this:
CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING)) AS sessions
Then, if you want to see a session ID along with the transaction ID, you could do something like:
SELECT Date,
CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING)) AS visit_ref,
hits.transaction.transactionID AS transaction_id
FROM `xxx.xxx.ga_sessions_20180619`
CROSS JOIN UNNEST(hits) AS hits
WHERE hits.transaction.transactionID IS NOT NULL
GROUP BY Date, visit_ref, transaction_id
Note that I have removed any sessions that did not transact in the above, but you could remove this WHERE
clause if you want them there.
If you are getting more rows using this method then consider that you may have sessions where a user places more than one order, you could get some insight into this by using STRING_AGG
to list all transaction ID's next to a session ID, in the example below I've also counted them and sorted by this count:
SELECT Date,
CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING)) AS visit_ref,
STRING_AGG(hits.transaction.transactionID) AS transaction_id,
COUNT(DISTINCT hits.transaction.transactionId) AS orders
FROM `xxx.xxx.ga_sessions_20180619`
CROSS JOIN UNNEST(hits) AS hits
WHERE hits.transaction.transactionID IS NOT NULL
GROUP BY Date, visit_ref
ORDER BY orders DESC
Upvotes: 3