Reputation: 23
I am working in MS SQL Server 2017 on counting member ED visits where a member may go months between ER visits or may have multiple consecutive days each with a visit. The rule that I am trying to calculate is this:
If a member has more than one ED visit in an 8-day period, include only the first eligible ED visit. For example, if a member has an eligible ED visit on January 1, include the January 1 visit and do not include ED visits that occur on or between January 2 and January 8. Then, if applicable, include the next eligible ED visit that occurs on or after January 9. Identify visits chronologically, including only one visit per 8-day period.
If the days since the last visit is NULL or >= 8 then that is always counted and I have that. The issue I am having is how to look at the running total of Days_Since_Last_Visit to find the next valid visit when there are multiple in a given 8 day period.
In the example below the rows marked in green are flagged for inclusion because the Days_Since_Last_Visit is NULL or >= 8.
The rows highlighted yellow are the ones that should be counted as the first valid visit in the next 8 day period. The bold outline shows the days that are adding up to reach the threshold of 8.
Example data with highlighted entries that should be counted
I have prepared the SQL for the example in the image hoping someone can help me get unstuck.
IF OBJECT_ID('tempdb..#Example') IS NOT NULL
DROP TABLE #Example
CREATE TABLE
#Example (
Subscriber_ID VARCHAR(16),
Member_Seq VARCHAR(2),
Measurement_Year INT,
Visit_Date DATETIME,
)
INSERT INTO #Example (Subscriber_ID,Member_Seq,Measurement_Year,Visit_Date) VALUES
('788768646','02','2019','2019-07-09'),
('788768646','02','2019','2019-08-05'),
('788768646','02','2019','2019-08-18'),
('788768646','02','2019','2019-09-13'),
('788768646','02','2019','2019-09-15'),
('788768646','02','2019','2019-09-19'),
('788768646','02','2019','2019-09-25'),
('788768646','02','2019','2019-10-14'),
('788768646','02','2019','2019-10-21'),
('788768646','02','2019','2019-10-24'),
('788768646','02','2019','2019-10-27'),
('788768646','02','2019','2019-10-28'),
('788768646','02','2019','2019-11-03'),
('788768646','02','2019','2019-11-06'),
('788768646','02','2019','2019-11-18'),
('788768646','02','2019','2019-12-11')
SELECT y.Subscriber_ID,
y.Member_Seq,
y.Measurement_Year,
y.Visit_Date,
y.Prior_Visit_Date,
y.Days_Since_Last_Visit,
CASE
WHEN Days_Since_Last_Visit >= 8 OR Days_Since_Last_Visit IS NULL THEN
'Y'
ELSE
NULL
END Include_Visist,
CASE
WHEN Days_Since_Last_Visit >= 8 OR Days_Since_Last_Visit IS NULL THEN
NULL
ELSE
SUM (CASE
WHEN Days_Since_Last_Visit >= 8 OR Days_Since_Last_Visit IS NULL THEN
NULL
ELSE
y.Days_Since_Last_Visit
END
) OVER (PARTITION BY y.Subscriber_ID, y.Member_Seq,y.Measurement_Year ORDER BY y.Visit_Date)
END Running_Total
FROM (
SELECT x.Subscriber_ID,
x.Member_Seq,
x.Measurement_Year,
x.Visit_Date,
LAG(Visit_Date) OVER (
PARTITION BY x.Subscriber_ID, x.Member_Seq, x.Measurement_Year
ORDER BY x.Visit_Date) Prior_Visit_Date,
DATEDIFF(DAY,
LAG(Visit_Date) OVER (
PARTITION BY x.Subscriber_ID, x.Member_Seq, x.Measurement_Year
ORDER BY x.Visit_Date),
x.Visit_Date) Days_Since_Last_Visit
FROM #Example x
) y
Upvotes: 2
Views: 124
Reputation: 1143
This was challenging. I added some additional test records to make sure it handled a long string of visits close together over several periods.
WITH Example as (
SELECT Subscriber_ID, Member_Seq, Measurement_Year, CAST(Visit_Date as datetime) as [Visit_Date]
FROM (
VALUES
('788768646','02','2019','2019-06-01'),
('788768646','02','2019','2019-06-09'),
('788768646','02','2019','2019-07-09'),
('788768646','02','2019','2019-08-05'),
('788768646','02','2019','2019-08-18'),
('788768646','02','2019','2019-09-13'),
('788768646','02','2019','2019-09-15'),
('788768646','02','2019','2019-09-19'),
('788768646','02','2019','2019-09-25'),
('788768646','02','2019','2019-10-14'),
('788768646','02','2019','2019-10-21'),
('788768646','02','2019','2019-10-24'),
('788768646','02','2019','2019-10-27'),
('788768646','02','2019','2019-10-28'),
('788768646','02','2019','2019-11-03'),
('788768646','02','2019','2019-11-06'),
('788768646','02','2019','2019-11-18'),
('788768646','02','2019','2019-12-11'),
('788768646','02','2020','2020-01-01'),
('788768646','02','2020','2020-01-08'),
('788768646','02','2020','2020-01-09'),
('788768646','02','2020','2020-01-16'),
('788768646','02','2020','2020-01-17'),
('788768646','02','2020','2020-01-24'),
('788768646','02','2020','2020-01-25'),
('788768699','02','2019','2019-06-06'),
('788768699','02','2019','2019-06-07'),
('788768699','02','2019','2019-07-17'),
('788768699','02','2019','2019-08-23')
) t (Subscriber_ID, Member_Seq, Measurement_Year, Visit_Date)
), AllVisits as (
SELECT *,
ROW_NUMBER() OVER(PARTITION BY Subscriber_ID ORDER BY Visit_Date) as [VisitSeq]
FROM Example
), Report as (
SELECT *, Visit_Date as [PeriodStart]
FROM AllVisits
WHERE VisitSeq = 1
UNION ALL
SELECT a.*,
-- if not in the prior period, start a new 8 day period
CASE WHEN a.Visit_Date > r.PeriodStart + 7 THEN a.Visit_Date ELSE r.PeriodStart END
FROM AllVisits a
INNER JOIN Report r
ON r.Subscriber_ID = a.Subscriber_ID AND r.[VisitSeq] + 1 = a.[VisitSeq]
)
select *,
PeriodStart + 7 AS [PeriodEnD],
CASE WHEN Visit_Date = PeriodStart THEN 1 ELSE 0 END as [IsFirstDayOfPeriod]
from Report
where Visit_Date = PeriodStart
order by Subscriber_ID, Visit_Date
The query below gets most of desired records, but not all of them. (This is where it misses a long string of close visits.) I wanted to start here, but but I could not use a subquery or a group by in a recursive part of the CTE. I would have to do something like the above, but the anchor would need both the sequence start and finish. Then recursion for each anchor record would be within it's range. I might try it someday.
SELECT *
FROM Example e
WHERE NOT EXISTS( -- those with no prior within 8 days
SELECT *
FROM Example x
WHERE x.Subscriber_ID = e.Subscriber_ID
AND x.Visit_Date < e.Visit_Date -- prior
AND x.Visit_Date > e.Visit_Date - 8 -- within 8 days
)
Upvotes: 2