SQL count of 90 day gaps between records

Question

Say I have a Payment table. I need to know the number of times the gap between payments is greater than 90 days grouped by personID. Payment frequency varies. There is no expected number of payments. There could 0 or hundreds of payments in a 90 day period. If there was no payment for a year, that would count as 1. If there was a payment every month, the count would be 0. If there were 4 payments the first month, then a 90 day gap, then 2 more payments, then another 90 day gap, the count would be 2.

CREATE TABLE Payments
(
    ID int PRIMARY KEY,
    PersonID int FOREIGN KEY REFERENCES Persons(ID),
    CreateDate datetime
)

JohnLBevan · Accepted Answer

If you have SQL Server 2014, you can use the LAG or LEAD function to peek at other rows, making this easy:

Select PersonId, Sum(InfrequentPayment) InfrequentPayments
from
(
    select PersonId
    , case 
        when dateadd(day,@period,paymentdate) < coalesce(lead(PaymentDate) over (partition by personid order by PaymentDate),getutcdate()) 
        then 1 
        else 0 
      end InfrequentPayment
    from @Payment
) x
Group by PersonId

Demo: http://sqlfiddle.com/#!6/9eecb7d/491

Explanation:

The outer SQL is fairly trivial; we take the results of the inner SQL, group by PersonId, and count/sum the number of times they've paid payment judged as Infrequent.

The inner SQL is also simple; we're selecting every record, making a note of the person and whether that payment (or rather the delay after that payment) was judged infrequent.

The case statement determines what constitutes an infrequent payment. Here we say that if the record's paymentdate plus 90 days is still earlier than the next payment (or current date if it's the last payment, so there's no next payment) then it's infrequent (1); otherwise it's not (0).

The coalesce is simply there to handle the last record for a person; i.e. so that if there is no next payment the current date is used (thus capturing anyone who's last payment was over 90 days before today).

Now for the "clever" bit: lead(PaymentDate) over (partition by personid order by PaymentDate). LEAD is a new SQL function which lets you look at the record after the current one (LAG is to see the previous record).
If you're familiar with row_number() or rank() you may already understand what's going on here. To determine the record after the current one we don't look at the current query though; rather we specify an order by clause just for this function; that's what's in the brackets after the over keyword. We also want to only compare each person's payment dates with other payments made by them; not by any customer. To achieve that we use the partition by clause.

I hope that makes sense / meets your requirement. Please say if anything's unclear and I'll try to improve my explanation.

EDIT

For older versions of SQL, the same effect can be achieved by use or ROW_NUMBER and a LEFT OUTER JOIN; i.e.

;with cte (PersonId, PaymentDate, SequenceNo) as
(
    select PersonId
    , PaymentDate
    , ROW_NUMBER() over (partition by PersonId order by PaymentDate)
    from @Payment
)
select a.PersonId
, sum(case when dateadd(day,@period,a.paymentdate) < coalesce(b.paymentdate,getutcdate()) then 1 else 0 end) InfrequentPayments
from cte a
left outer join cte b 
on b.PersonId = a.PersonId 
and b.SequenceNo = a.SequenceNo + 1
Group by a.PersonId

Another method which should work on most databases (though less efficient)

select PersonId
, sum(InfrequentPayment) InfrequentPayments
from 
(
    select PersonId
    , case when dateadd(day,@period,paymentdate) < coalesce((
        select min(PaymentDate) 
        from @Payment b 
        where b.personid = a.personid
        and b.paymentdate > a.paymentdate
    ),getutcdate()) then 1 else 0 end InfrequentPayment
    from @Payment a
) x
Group by PersonId

SQL count of 90 day gaps between records

Answers (2)

Related Questions