Akshay Singh
Akshay Singh

Reputation: 67

In Redshift, how do I run the opposite of a SUM function

Assuming I have a data table

date        |   user_id  | user_last_name | order_id  | is_new_session
------------+------------+----------------+-----------+---------------
 2014-09-01 | A          | B              | 1         | t
 2014-09-01 | A          | B              | 5         | f
 2014-09-02 | A          | B              | 8         | t
 2014-09-01 | B          | B              | 2         | t
 2014-09-02 | B          | test           | 3         | t
 2014-09-03 | B          | test           | 4         | t
 2014-09-04 | B          | test           | 6         | t
 2014-09-04 | B          | test           | 7         | f
 2014-09-05 | B          | test           | 9         | t
 2014-09-05 | B          | test           | 10        | f

I want to get another column in Redshift which basically assigns session numbers to each users session. It starts at 1 for the first record for each user and as you move further down, if it encounters a true in the "is_new_session" column, it increments. Stays the same if it encounters a false. If it hits a new user, the value resets to 1. The ideal output for this table would be:

1
1
2
1
2
3
4
4
5
5

In my mind it's kind of the opposite of a SUM(1) over (Partition BY user_id, is_new_session ORDER BY user_id, date ASC)

Any ideas?

Thanks!

Upvotes: 1

Views: 239

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1271003

I think you want an incremental sum:

select t.*,
       sum(case when is_new_session then 1 else 0 end) over (partition by user_id order by date) as session_number
from t;

In Redshift, you might need the windowing clause:

select t.*,
       sum(case when is_new_session then 1 else 0 end) over
           (partition by user_id
            order by date
            rows between unbounded preceding and current row
           ) as session_number
from t;

Upvotes: 2

Related Questions