william
william

Reputation: 11

Google Analytics Transactions Ids with Session Ids in BigQuery

I am working with Google Analytics data in BigQuery and trying to combine Sessions Ids with Transaction IDs. I am facing some difficulties as I do not understand yet how to combin Session Ids (to match with GA UI I've used totals.visits = 1 in WHERE clause) with Transactions IDs (UNNEST(hits)). Condition: Data should match with the GA UI.

Desired outcome

Date | Session ID | Transaction ID
2019-06-18 00:00:00.000 UTC | 400101010101 | AAAAAAA
2019-06-18 00:00:00.000 UTC | 400202010102 | BBBBBBB
2019-06-18 00:00:00.000 UTC | 400303010103 | CCCCCC

Current query to retrieve Timestamp and SessionIds

#standardSQL
SELECT
TIMESTAMP(PARSE_DATE('%Y%m%d', date)) AS Date,
CONCAT(fullVisitorId, CAST(visitId AS STRING) ) AS SessionID 
COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitId AS STRING) )) AS Sessions, /* verify match with GA UI*/
FROM
`ga-xxxxx.yyyyyyyy_sessions_20*`
WHERE
_TABLE_SUFFIX = '180618'
AND totals.visits = 1
GROUP BY
1,2

Current query expended with TransactionID

When expending the query with hits.transaction.transactionId and UNNEST(hits) I get more records than in GA UI. Seems like the query is counting the SessionIDs plus TransactionIDs.

#standardSQL
SELECT
TIMESTAMP(PARSE_DATE('%Y%m%d', date)) AS Date,
CONCAT(fullVisitorId, CAST(visitId AS STRING) ) AS SessionID, 
hits.transaction.transactionId AS transactionId, 
COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitId AS STRING) )) AS Sessions, /* verify match with GA UI*/
FROM
`ga-xxxxx.yyyyyyyy_sessions_20*`,                                  
 UNNEST(hits) AS hits
WHERE
_TABLE_SUFFIX = '180618'
AND totals.visits = 1
GROUP BY
1,2,3

How can I get SessionID and hits.transaction.transactionId into the query with the condition that the data match with GA UI?

Hope somebody can help me out.

Thanks.

Upvotes: 1

Views: 4750

Answers (1)

Ben P
Ben P

Reputation: 3369

To match the count of sessions in the GA UI back to BigQuery you can count sessions like this:

CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING)) AS sessions

Then, if you want to see a session ID along with the transaction ID, you could do something like:

    SELECT Date,
    CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING)) AS visit_ref,
    hits.transaction.transactionID AS transaction_id
    FROM `xxx.xxx.ga_sessions_20180619`
      CROSS JOIN UNNEST(hits) AS hits
    WHERE hits.transaction.transactionID IS NOT NULL
    GROUP BY Date, visit_ref, transaction_id

Note that I have removed any sessions that did not transact in the above, but you could remove this WHERE clause if you want them there.

If you are getting more rows using this method then consider that you may have sessions where a user places more than one order, you could get some insight into this by using STRING_AGG to list all transaction ID's next to a session ID, in the example below I've also counted them and sorted by this count:

   SELECT Date,
   CONCAT(CAST(fullVisitorId AS STRING),CAST(visitId AS STRING)) AS visit_ref,
   STRING_AGG(hits.transaction.transactionID) AS transaction_id, 
   COUNT(DISTINCT hits.transaction.transactionId) AS orders
   FROM `xxx.xxx.ga_sessions_20180619`
     CROSS JOIN UNNEST(hits) AS hits
   WHERE hits.transaction.transactionID IS NOT NULL
   GROUP BY Date, visit_ref
   ORDER BY orders DESC

Upvotes: 3

Related Questions