Reputation: 113
I am trying to identify the retention period of new customers that we acquire every month. Have identified the New Customer logic from the transactions, I have no lead on how to start M1 to M10
I need to get something like below, to explain the table, in the month of Jan we have acquired 2500 customers, of those 2.5k new customers only 1600 transacted in M1(Feb), of those 1600 only 1200 transacted in M2(Mar) and so on
Similarly, in the month of Feb we have acquired 2k customers, of those only 1100 transacted in M1(here M1 refers to Mar), of those 1100 only 800 transacted in M2(here M2 refers to Apr)
M2 is subset of M1, M3 is subset of M2 and so on.
Using SQL Server 2012, I want to avoid preprocessing of data due to certain limitation on my role and access. Any leads with sql logic will help.
Upvotes: 0
Views: 730
Reputation: 1261
Based on Gordon's answer, I propose the solution: http://sqlfiddle.com/#!18/f6785/3
select
year(first_yyyymm),
month(first_yyyymm),
count(distinct customer_id) as new_customers,
sum(case when seqnum = 1 then 1 else 0 end) as m1,
sum(case when seqnum = 2 then 1 else 0 end) as m2,
sum(case when seqnum = 3 then 1 else 0 end) as m3,
sum(case when seqnum = 4 then 1 else 0 end) as m4,
sum(case when seqnum = 5 then 1 else 0 end) as m5,
sum(case when seqnum = 6 then 1 else 0 end) as m6,
sum(case when seqnum = 7 then 1 else 0 end) as m7,
sum(case when seqnum = 8 then 1 else 0 end) as m8,
sum(case when seqnum = 9 then 1 else 0 end) as m9,
sum(case when seqnum = 10 then 1 else 0 end) as m10
from
(
select
customer_id,
first_yyyymm, yyyymm,
datediff(month, first_yyyymm, yyyymm) as seqnum
from
(
select
customer_id,
eomonth(created_at) as yyyymm,
min(eomonth(created_at))
over (partition by customer_id) as first_yyyymm
from transactions t
group by customer_id, eomonth(created_at)
) t
) t
group by year(first_yyyymm), month(first_yyyymm)
order by month(first_yyyymm);
For the data:
The result shall be:
Here's another solution, computing just those customers with transactions in every month.
http://sqlfiddle.com/#!18/ad3803/2
Upvotes: 1
Reputation: 1269753
I would suggest the following:
row_number()
In SQL, this looks like:
select year(first_yyyymm), month(first_yyyymm),
count(*) as new_customers,
sum(case when seqnum = 1 then 1 else 0 end) as m1,
sum(case when seqnum = 2 then 1 else 0 end) as m2,
sum(case when seqnum = 3 then 1 else 0 end) as m3,
sum(case when seqnum = 4 then 1 else 0 end) as m4,
sum(case when seqnum = 5 then 1 else 0 end) as m5,
sum(case when seqnum = 6 then 1 else 0 end) as m6,
sum(case when seqnum = 7 then 1 else 0 end) as m7,
sum(case when seqnum = 8 then 1 else 0 end) as m8,
sum(case when seqnum = 9 then 1 else 0 end) as m9,
sum(case when seqnum = 10 then 1 else 0 end) as m10
from (select customer, eomonth(date) as yyyymm,
min(eomonth(date)) over (partition by customer) as first_eomonth,
row_number() over (partition by customer order by eomonth(date)) as seqnum
from transactions t
group by customer, eomonth(date)
) t
where datediff(month, first_yyyymm, yyyymm) = seqnum - 1
group by year(first_yyyymm), month(first_yyyymm)
order by min(first_yyyymm);
Upvotes: 1