Reputation: 8917
I've got a SQL table where I want to find the first and last dates of a group of records, providing they're sequential.
Patient | TestType | Result | Date
------------------------------------------
1 | 1 | A | 2012-03-04
1 | 1 | A | 2012-08-19
1 | 1 | B | 2013-05-27
1 | 1 | A | 2013-06-20
1 | 2 | X | 2012-08-19
1 | 2 | X | 2013-06-20
2 | 1 | B | 2014-09-09
2 | 1 | B | 2015-04-19
Should be returned as
Patient | TestType | Result | StartDate | EndDate
--------------------------------------------------------
1 | 1 | A | 2012-03-04 | 2012-08-19
1 | 1 | B | 2013-05-27 | 2013-05-27
1 | 1 | A | 2013-06-20 | 2013-06-20
1 | 2 | X | 2012-08-19 | 2013-06-20
2 | 1 | B | 2014-09-09 | 2015-04-19
The problem is that if I just group by Patient
, TestType
, and Result
,
then the first and third rows in the example above would become a single row.
Patient | TestType | Result | StartDate | EndDate
--------------------------------------------------------
1 | 1 | A | 2012-03-04 | 2013-06-20
1 | 1 | B | 2013-05-27 | 2013-05-27
1 | 2 | X | 2012-08-19 | 2013-06-20
2 | 1 | B | 2014-09-09 | 2015-04-19
I feel like there's got to be something clever I can do with a partition, but I can't quite figure out what it is.
Upvotes: 1
Views: 84
Reputation: 7184
See if this gives you what you need.
with T1 as (
select
*,
case when lag(Patient,1)
over (order by Patient, TestType, Result) = Patient
and lag(TestType,1)
over (order by Patient, TestType, Result) = TestType
and lag(Result,1)
over (order by Patient, TestType, Result) = Result
then null else 1 end as Changes
from t
), T2 as (
select
Patient,
TestType,
Result,
dt,
sum(Changes) over (
order by Patient, TestType, Result, dt
) as seq
from T1
)
select
Patient,
TestType,
Result,
min(dt) as dtFrom,
max(dt) as dtTo
from T2
group by Patient, TestType, Result, seq
order by Patient, TestType, Result
Upvotes: 0
Reputation: 1269683
There are several ways to approach this. I like identifying the groups using the difference of row number values:
select patient, testtype, result,
min(date) as startdate, max(date) as enddate
from (select t.*,
(row_number() over (partition by patient, testtype order by date) -
row_number() over (partition by patient, testtype, result order by date)
) as grp
from table t
) t
group by patient, testtype, result, grp
order by patient, startdate;
Upvotes: 2
Reputation: 49260
select patient, testtype, result, date as startdate,
isnull(lead(date) over(partition by patient, testtype, result order by date), date) as enddate
from tablename;
You can use lead
function to get the value of date (as enddate) from the next row in each group.
SQL Fiddle with sample data.
Upvotes: 0