Reputation: 55
I am working in SQL Server 2008 R2 and having a hard time gathering new vs repeat customer orders.
I have data in this format:
OrderID OrderDate Customer OrderAmount
-----------------------------------------------
1 1/1/2017 A $10
2 1/2/2017 B $20
3 1/3/2017 C $30
4 4/1/2017 C $40
5 4/2/2017 D $50
6 4/3/2017 D $60
7 1/6/2018 B $70
Here's what we want:
This means that if a new customer places multiple orders in her first month, they would all be considered "new" customer orders. And orders placed in subsequent months would all be considered "repeat" customer orders.
We want to get New orders (count and sum) and Repeat orders (count and sum) per year, per month:
Year Month NewCount NewSum RepeatCount RepeatSum
-----------------------------------------------------------------------------
2017 1 3 (A,B,C) $60 (10+20+30) 0 $0
2017 4 2 (D,D) $110 (50+60) 1 (C) $40 (40)
2018 1 0 $0 1 (B) $70 (70)
(The info in () parenthesis is not part of the result; just putting it here for clarity)
The SQL is easy to write for any single given month, but I don't know how to do it when gathering years worth of months at a time...
If there is a month with no orders of any kind then NULL or 0 values for the year:month would be preferred.
Upvotes: 1
Views: 2635
Reputation: 4146
You can use dense_rank
to find new and old customers. This query returns your provided output
declare @t table (OrderID int, OrderDate date, Customer char(1), OrderAmount int)
insert into @t
values (1, '20170101', 'A', 10)
, (2, '20170102', 'B', 20), (3, '20170103', 'C', 30)
, (4, '20170401', 'C', 40), (5, '20170402', 'D', 50)
, (6, '20170403', 'D', 60), (7, '20180106', 'B', 70)
select
[year], [month], NewCount = isnull(sum(case when dr = 1 then 1 end), 0)
, NewSum = isnull(sum(case when dr = 1 then OrderAmount end), 0)
, RepeatCount = isnull(sum(case when dr > 1 then 1 end), 0)
, RepeatSum = isnull(sum(case when dr > 1 then OrderAmount end), 0)
from (
select
*, [year] = year(OrderDate), [month] = month(OrderDate)
, dr = dense_rank() over (partition by Customer order by dateadd(month, datediff(month, 0, OrderDate), 0))
from
@t
) t
group by [year], [month]
Output
year month NewCount NewSum RepeatCount RepeatSum
----------------------------------------------------------
2017 1 3 60 0 0
2018 1 0 0 1 70
2017 4 2 110 1 40
You must get combination of each year in the table with all months at first if you want to display months without orders. Then join with upper query
select
*
from
(select distinct y = year(OrderDate) from @t) t
cross join (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) q(m)
Upvotes: 1
Reputation: 1270633
First, start by summarizing the data with one record per customer per month.
Then, you can use a self-join or similar construct to get the information you need:
with cm as (
select customer, dateadd(day, 1 - day(orderdate), orderdate) as yyyymm
sum(orderamount) as monthamount, count(*) as numorders
from orders
group by customer
)
select year(cm.yyyymm) as yr, month(cm.yyyymm) as mon,
sum(case when cm.num_orders > 0 and cm_prev.customer is null then 1 else 0 end) as new_count,
sum(case when cm.num_orders > 0 and cm_prev.customer is null then monthamount else 0 end) as new_amount,
sum(case when cm.num_orders > 0 and cm_prev.customer > 0 then 1 else 0 end) as repeat_count,
sum(case when cm.num_orders > 0 and cm_prev.customer > 0 then monthamount else 0 end) as repeat_amount
from cm left join
cm cm_prev
on cm.customer = cm_prev.customer and
cm.yyyymm = dateadd(month, 1, cm_prev.yyyymm)
group by year(cm.yyyymm), month(cm.yyyymm)
order by year(cm.yyyymm), month(cm.yyyymm);
This would be a bit easier in SQL Server 2012, where you can use lag()
.
Upvotes: 1