Billy Turtleneck
Billy Turtleneck

Reputation: 55

SQL Server query for new and repeat orders per month

I am working in SQL Server 2008 R2 and having a hard time gathering new vs repeat customer orders.

I have data in this format:

OrderID     OrderDate   Customer    OrderAmount
-----------------------------------------------
1           1/1/2017    A           $10
2           1/2/2017    B           $20
3           1/3/2017    C           $30
4           4/1/2017    C           $40
5           4/2/2017    D           $50
6           4/3/2017    D           $60
7           1/6/2018    B           $70

Here's what we want:

This means that if a new customer places multiple orders in her first month, they would all be considered "new" customer orders. And orders placed in subsequent months would all be considered "repeat" customer orders.

We want to get New orders (count and sum) and Repeat orders (count and sum) per year, per month:

Year    Month   NewCount        NewSum          RepeatCount         RepeatSum
-----------------------------------------------------------------------------
2017    1       3 (A,B,C)       $60 (10+20+30)  0                   $0
2017    4       2 (D,D)         $110 (50+60)    1 (C)               $40 (40)
2018    1       0               $0              1 (B)               $70 (70)

(The info in () parenthesis is not part of the result; just putting it here for clarity)

The SQL is easy to write for any single given month, but I don't know how to do it when gathering years worth of months at a time...

If there is a month with no orders of any kind then NULL or 0 values for the year:month would be preferred.

Upvotes: 1

Views: 2635

Answers (2)

uzi
uzi

Reputation: 4146

You can use dense_rank to find new and old customers. This query returns your provided output

declare @t table (OrderID int, OrderDate date, Customer char(1), OrderAmount int)
insert into @t
values  (1, '20170101', 'A', 10)
    , (2, '20170102', 'B', 20), (3, '20170103', 'C', 30)
    , (4, '20170401', 'C', 40), (5, '20170402', 'D', 50)
    , (6, '20170403', 'D', 60), (7, '20180106', 'B', 70)

select
    [year], [month], NewCount = isnull(sum(case when dr = 1 then 1 end), 0)
    , NewSum = isnull(sum(case when dr = 1 then OrderAmount end), 0)
    , RepeatCount = isnull(sum(case when dr > 1 then 1 end), 0)
    , RepeatSum = isnull(sum(case when dr > 1 then OrderAmount end), 0)
from (
    select 
        *, [year] = year(OrderDate), [month] = month(OrderDate)
        , dr = dense_rank() over (partition by Customer order by dateadd(month, datediff(month, 0, OrderDate), 0))
    from 
        @t
) t
group by [year], [month]

Output

year    month  NewCount   NewSum   RepeatCount   RepeatSum
----------------------------------------------------------
2017    1      3          60       0             0
2018    1      0          0        1             70
2017    4      2          110      1             40

You must get combination of each year in the table with all months at first if you want to display months without orders. Then join with upper query

select
    *
from
    (select distinct y = year(OrderDate) from @t) t
    cross join (values (1), (2), (3), (4), (5), (6), (7), (8), (9), (10), (11), (12)) q(m)

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1270633

First, start by summarizing the data with one record per customer per month.

Then, you can use a self-join or similar construct to get the information you need:

with cm as (
      select customer, dateadd(day, 1 - day(orderdate), orderdate) as yyyymm
             sum(orderamount) as monthamount, count(*) as numorders
      from orders
      group by customer
     )
select year(cm.yyyymm) as yr, month(cm.yyyymm) as mon,
       sum(case when cm.num_orders > 0 and cm_prev.customer is null then 1 else 0 end) as new_count,
       sum(case when cm.num_orders > 0 and cm_prev.customer is null then monthamount else 0 end) as new_amount,
       sum(case when cm.num_orders > 0 and cm_prev.customer > 0 then 1 else 0 end) as repeat_count,
       sum(case when cm.num_orders > 0 and cm_prev.customer > 0 then monthamount else 0 end) as repeat_amount
from cm left join
     cm cm_prev
     on cm.customer = cm_prev.customer and
        cm.yyyymm = dateadd(month, 1, cm_prev.yyyymm)
group by year(cm.yyyymm), month(cm.yyyymm)
order by year(cm.yyyymm), month(cm.yyyymm);

This would be a bit easier in SQL Server 2012, where you can use lag().

Upvotes: 1

Related Questions