Christien
Christien

Reputation: 1

How can I reset running total in SQL BigQuery after condition in other column is met?

In BigQuery, I have a data set from Google Analytics on hit level containing, amongst others: date, fullVisitorId, visitId, transactionRevenue, and pageHome (whether or not the page someone is on is the homepage or not). For each visitor, I want to calculate the running total of pageHome until a transaction is made. If a transaction is made, then transactionRevenue is a number, otherwise it's null. For an example, please see the attached picture.

Example table running total

I have tried a sum over the partition by fullVisitorId and ordering by visitId. However, I'm not sure how to reset when transactionId is not null.

SELECT
    date,
    fullVisitorId,
    visitId,
    visitNumber,
    transactionRevenue,
    pageHome,
    SUM(pageHome) OVER(PARTITION BY fullVisitorId ORDER BY visitId ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sumpageHome

This code calculates the running total per visitor, but it does not reset yet after a transaction is made.

Upvotes: 0

Views: 1217

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1270873

For each visitor, I want to calculate the running total of pageHome until a transaction is made.

Use a subquery to calculate the transaction number. Then use that in a cumulative sum:

SELECT SUM(pageHome) OVER (PARTITION BY fullVisitorId,  transaction_count ORDER BY DATE) as running_pageHome
FROM (SELECT t.*,
             COUNT(TransactionRevenue) OVER (PARTITION BY fullVisitorId ORDER BY DATE) as transaction_count
      FROM t
     ) t

Upvotes: 2

Related Questions