Smasell
Smasell

Reputation: 1228

SQL cohort calculations

I have my table of players activity like this:

user_id event_name install_date event_date
1 active 2021-03-01 2021-03-01
1 active 2021-03-01 2021-03-01
1 active 2021-03-01 2021-03-02
2 active 2021-03-02 2021-03-02
2 active 2021-03-02 2021-03-04
2 active 2021-03-02 2021-03-04

and I want to calculate cohort retention like this

user_id install_date ret0 ret1 ret2
1 2021-03-01 1 1 0
2 2021-03-02 1 0 1

Help me please to write sql query. Thanks)

Upvotes: 0

Views: 120

Answers (2)

Mikhail Berlyant
Mikhail Berlyant

Reputation: 172994

Consider below approach - less verbose and better manageable and expandable to more generic cases

select * from (
  select user_id, install_date, date_diff(event_date, install_date, day) diff
  from `project.dataset.table`
)
pivot (count(diff) as ret for diff in (0, 1, 2))      

if applied to sample data in your question - output is

enter image description here

Btw, if you want to output 1 or 0 in respective columns - you can adjust above to

select * from (
  select user_id, install_date, date_diff(event_date, install_date, day) diff
  from `project.dataset.table` 
  group by 1,2,3
)
pivot (count(diff) as ret for diff in (0, 1, 2))          

in this case - output is

enter image description here

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1269703

If I understand correctly, you just want to compare the event_date to the install_date and keep track of when "x" days appear between the two:

select user_id,
       max(case when event_date = install_date then 1 else 0 end) as ret1,
       max(case when event_date = date_add(install_date, interval 1 day) then 1 else 0 end) as ret1,
       max(case when event_date = date_add(install_date, interval 2 day) then 1 else 0 end) as ret2
from t
group by user_id;

Upvotes: 1

Related Questions