Reputation: 259
I have this formula for the customer retention rate but kinda stuck in translating it into a sql code in snowflake:
Customer Retention rate: number of customers who purchased in the past AND in the period of [last 30 days] / number of customers who have purchased in the past [not including last 30 days]
Upvotes: 0
Views: 250
Reputation: 25928
if we check for each customer if they have "old sales" and "new sales"
select customer_id, min(date) as min_date, max(date) as max_date, min_date < (current_date()-30) as old_sales, max_date >= (current_date()-30) as new_sales
from values
(1,'2022-05-01'),
(2,'2022-05-01'),
(2,'2022-04-01'),
(3,'2022-04-01'),
(4,'2022-04-01'),
(4,'2022-03-01')
t(customer_id, date)
group by 1
we get:
CUSTOMER_ID | MIN_DATE | MAX_DATE | OLD_SALES | NEW_SALES |
---|---|---|---|---|
1 | 2022-05-01 | 2022-05-01 | FALSE | TRUE |
2 | 2022-04-01 | 2022-05-01 | TRUE | TRUE |
3 | 2022-04-01 | 2022-04-01 | TRUE | FALSE |
4 | 2022-03-01 | 2022-04-01 | TRUE | FALSE |
we then want to count both as true for the top, and trop/false for the bottom, we can use:
select count_if(old_sales and new_sales) as top
,count_if(old_sales and new_sales=false) as bottom
,div0(top, bottom) as rention
from (
select customer_id, min(date) as min_date, max(date) as max_date, min_date < (current_date()-30) as old_sales, max_date >= (current_date()-30) as new_sales
from values
(1,'2022-05-01'),
(2,'2022-05-01'),
(2,'2022-04-01'),
(3,'2022-04-01'),
(4,'2022-04-01'),
(4,'2022-03-01')
t(customer_id, date)
group by 1
);
and get:
TOP | BOTTOM | RENTION |
---|---|---|
1 | 2 | 0.5 |
thus with a data_table
as our source we can use:
select div0(count_if(old_sales and new_sales), count_if(old_sales and new_sales=false)) as rention
from (
select
min(date) < (current_date()-30) as old_sales,
max(date) >= (current_date()-30) as new_sales
from data_table
group by customer_id
);
to get the magic 50%
Upvotes: 3