Sean R
Sean R

Reputation: 28

Count multiple repeats after event as single repeat

What I'm trying to do in come up with a single query that can give the percentage of repeats within 30 days of an initial event, but only count any events within 30 days as a single repeat. Here's a sample data set for a single person:

Person       Date
══════════════
A                3/1/14
A                3/21/14
A                3/29/14
A                4/14/14
A                4/17/14

In this case, 3/21 would be the repeat event, and 3/29 wouldn't be counted as a second. 4/14 would be the start of the next window, with 4/17 being the second repeat.

To calculate the percentage of repeats here, the numerator would be the distinct count of people who had an initial event in the month and also had a subsequent event within 30 days. The denominator is a distinct count of people with events in that month. In the case of crossing months, the repeat is counted within the month of the initial event.

I know I could come up with something that uses a loop/cursor or temp table, but as the data set grows, it's going to take forever. Does anyone have any thoughts on how to do this as a single query? It's probably going to involve a couple of CTE's. Everything I've come up with so far has failed.

Upvotes: 1

Views: 206

Answers (1)

Kevin Suchlicki
Kevin Suchlicki

Reputation: 3145

Nice one... try this:

create table #t (Person varchar(10), EventDate date);
insert #t (Person, EventDate)
values
('A', '3/1/14'),
('A', '3/21/14'),
('A', '3/29/14'),
('A', '4/14/14'),
('A', '4/17/14'),
('A', '8/3/14'),
('B', '3/25/14'),
('B', '4/2/14'),
('B', '4/20/14'),
('B', '6/14/14'),
('B', '8/17/14'),
('B', '8/26/14');
;WITH OrderedEvents AS (
    SELECT  Person, EventDate, ROW_NUMBER() OVER (PARTITION BY Person ORDER BY EventDate) AS Ord
    FROM    #t
)
, RepeatedEvents AS (
    SELECT  Person, EventDate, Ord, EventDate AS InitialDate
    FROM    OrderedEvents
    WHERE   Ord = 1
    UNION ALL
    SELECT  o.Person, o.EventDate, o.Ord
            , CASE WHEN DATEDIFF(DAY, r.InitialDate, o.EventDate) > 30 THEN o.EventDate ELSE r.InitialDate END
    FROM    OrderedEvents o
            JOIN RepeatedEvents r ON o.Person = r.Person AND o.Ord = r.Ord + 1
)
, GroupedEvents AS (
    SELECT  Person, MONTH(InitialDate) AS Mth, YEAR(InitialDate) AS Yr
            , IsRepeat = CASE WHEN COUNT(*) > 1 THEN 1 ELSE 0 END
    FROM    RepeatedEvents
    GROUP BY Person, MONTH(InitialDate), YEAR(InitialDate)
)
SELECT  Mth, Yr, CAST(SUM(IsRepeat) AS NUMERIC) / CAST(COUNT(DISTINCT person) AS NUMERIC) AS Pct
FROM    GroupedEvents 
GROUP BY Mth, Yr;

Upvotes: 1

Related Questions