Cohort analysis with Amazon Redshift / PostgreSQL

Question

I'm trying to analyze user retention using a cohort analysis based on event data stored in Redshift.

For example, in Redshift I have:

timestamp          action        user id
---------          ------        -------
2015-05-05 12:00   homepage      1
2015-05-05 12:01   product page  1
2015-05-05 12:02   homepage      2
2015-05-05 12:03   checkout      1

I would like to extract the daily retention cohort. For example:

signup_day  users_count d1  d2  d3  d4  d5  d6  d7 
----------  ----------- --  --  --  --  --  --  --  
2015-05-05  100         80  60  40  20  17  16  12
2015-05-06  150         120 90  60  30  22  18  15

Here signup_day is the first date on which we have a record of any action by a user, users_count is the total number of users who signed up on signup_day, d1 is the number of those users who performed any action one day after signup_day, d2 the number who did so two days after, and so on through d7.
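
To make these definitions concrete, here is a minimal sketch in plain Python (not a Redshift query) that computes the table from rows shaped like the sample above. The names `EVENTS`, `daily_retention`, and `max_day` are hypothetical and only for illustration. It assumes a user counts towards dN if they performed at least one action exactly N calendar days after their signup_day.

```python
from collections import defaultdict
from datetime import datetime

# Sample events from the question: (timestamp, action, user_id).
EVENTS = [
    ("2015-05-05 12:00", "homepage", 1),
    ("2015-05-05 12:01", "product page", 1),
    ("2015-05-05 12:02", "homepage", 2),
    ("2015-05-05 12:03", "checkout", 1),
]


def daily_retention(events, max_day=7):
    """Return {signup_day: [users_count, d1, ..., d<max_day>]}."""
    # Collect the distinct calendar days on which each user did anything.
    active_days = defaultdict(set)
    for ts, _action, user_id in events:
        day = datetime.strptime(ts, "%Y-%m-%d %H:%M").date()
        active_days[user_id].add(day)

    cohorts = {}
    for days in active_days.values():
        signup_day = min(days)  # first day with any recorded action
        row = cohorts.setdefault(signup_day, [0] * (max_day + 1))
        row[0] += 1  # users_count
        for day in days:
            offset = (day - signup_day).days
            if 1 <= offset <= max_day:
                row[offset] += 1  # active exactly `offset` days after signup
    return dict(sorted(cohorts.items()))


if __name__ == "__main__":
    for signup_day, row in daily_retention(EVENTS).items():
        print(signup_day, *row)
```

With only the four sample rows, both users fall into the 2015-05-05 cohort and every dN column is zero, since there are no events on later days.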

Is there a better way to represent the retention analysis data?

What would be the best query to achieve that with Amazon Redshift? Is it possible to do this with a single query?

