Reputation: 338
Say I have a Payment table. I need to know the number of times the gap between payments is greater than 90 days grouped by personID. Payment frequency varies. There is no expected number of payments. There could 0 or hundreds of payments in a 90 day period. If there was no payment for a year, that would count as 1. If there was a payment every month, the count would be 0. If there were 4 payments the first month, then a 90 day gap, then 2 more payments, then another 90 day gap, the count would be 2.
CREATE TABLE Payments
(
ID int PRIMARY KEY,
PersonID int FOREIGN KEY REFERENCES Persons(ID),
CreateDate datetime
)
Upvotes: 1
Views: 1158
Reputation: 24410
If you have SQL Server 2014, you can use the LAG
or LEAD
function to peek at other rows, making this easy:
Select PersonId, Sum(InfrequentPayment) InfrequentPayments
from
(
select PersonId
, case
when dateadd(day,@period,paymentdate) < coalesce(lead(PaymentDate) over (partition by personid order by PaymentDate),getutcdate())
then 1
else 0
end InfrequentPayment
from @Payment
) x
Group by PersonId
Demo: http://sqlfiddle.com/#!6/9eecb7d/491
Explanation:
The outer SQL is fairly trivial; we take the results of the inner SQL, group by PersonId, and count/sum the number of times they've paid payment judged as Infrequent.
The inner SQL is also simple; we're selecting every record, making a note of the person and whether that payment (or rather the delay after that payment) was judged infrequent.
The case statement determines what constitutes an infrequent payment. Here we say that if the record's paymentdate plus 90 days is still earlier than the next payment (or current date if it's the last payment, so there's no next payment) then it's infrequent (1); otherwise it's not (0).
The coalesce
is simply there to handle the last record for a person; i.e. so that if there is no next payment the current date is used (thus capturing anyone who's last payment was over 90 days before today).
Now for the "clever" bit: lead(PaymentDate) over (partition by personid order by PaymentDate)
.
LEAD
is a new SQL function which lets you look at the record after the current one (LAG
is to see the previous record).
If you're familiar with row_number()
or rank()
you may already understand what's going on here.
To determine the record after the current one we don't look at the current query though; rather we specify an order by
clause just for this function; that's what's in the brackets after the over
keyword.
We also want to only compare each person's payment dates with other payments made by them; not by any customer. To achieve that we use the partition by
clause.
I hope that makes sense / meets your requirement. Please say if anything's unclear and I'll try to improve my explanation.
EDIT
For older versions of SQL, the same effect can be achieved by use or ROW_NUMBER
and a LEFT OUTER JOIN
; i.e.
;with cte (PersonId, PaymentDate, SequenceNo) as
(
select PersonId
, PaymentDate
, ROW_NUMBER() over (partition by PersonId order by PaymentDate)
from @Payment
)
select a.PersonId
, sum(case when dateadd(day,@period,a.paymentdate) < coalesce(b.paymentdate,getutcdate()) then 1 else 0 end) InfrequentPayments
from cte a
left outer join cte b
on b.PersonId = a.PersonId
and b.SequenceNo = a.SequenceNo + 1
Group by a.PersonId
Another method which should work on most databases (though less efficient)
select PersonId
, sum(InfrequentPayment) InfrequentPayments
from
(
select PersonId
, case when dateadd(day,@period,paymentdate) < coalesce((
select min(PaymentDate)
from @Payment b
where b.personid = a.personid
and b.paymentdate > a.paymentdate
),getutcdate()) then 1 else 0 end InfrequentPayment
from @Payment a
) x
Group by PersonId
Upvotes: 3
Reputation: 6969
Generic query for this problem given a timestamp
field would be something like this:
SELECT p1.personID, COUNT(*)
FROM payments p1
JOIN payments p2 ON
p1.timestamp < p2.timestamp
AND p1.personID = p2.personID
AND NOT EXISTS (-- exclude combinations of p1 and p2 where p exists between them
SELECT * FROM payments p
WHERE p.personID = p1.personID
AND p.timestamp > p1.timestamp
AND p.timestamp < p2.timestamp)
WHERE
DATEDIFF(p2.timestamp, p1.timestamp) >= 90
GROUP BY p1.personID
Upvotes: 0