Reputation: 2602
With the below sample data, I am trying to group record with same rate.
id start_date end_date rate
-----------------------------------------------------------------
1 01/01/2017 12:00:00 am 01/01/2017 12:00:00 am 300
1 02/01/2017 12:00:00 am 02/01/2017 12:00:00 am 300
1 03/01/2017 12:00:00 am 03/01/2017 12:00:00 am 300
1 04/01/2017 12:00:00 am 04/01/2017 12:00:00 am 1000
1 05/01/2017 12:00:00 am 05/01/2017 12:00:00 am 500
1 06/01/2017 12:00:00 am 06/01/2017 12:00:00 am 500
1 07/01/2017 12:00:00 am 07/01/2017 12:00:00 am 1000
1 08/01/2017 12:00:00 am 08/01/2017 12:00:00 am 1000
1 09/01/2017 12:00:00 am 09/01/2017 12:00:00 am 300
What I've tried :
select distinct id, mn_date, mx_date,rate
from (
select id, min(start_date) over (partition by grp order by start_date) mn_date,
max(end_date) over(partition by grp order by start_date desc) mx_date, rate
from (
select t.*, row_number() over(partition by id order by start_date) -row_number() over(partition by rate order by start_date)grp
from t
)
)
order by mn_date;
Output :
id mn_date mx_date rate
--------------------------------------------------------
1 01/01/2017 12:00:00 am 03/01/2017 12:00:00 am 300
1 04/01/2017 12:00:00 am 04/01/2017 12:00:00 am 1000
1 05/01/2017 12:00:00 am 06/01/2017 12:00:00 am 500
1 07/01/2017 12:00:00 am 09/01/2017 12:00:00 am 300
1 07/01/2017 12:00:00 am 09/01/2017 12:00:00 am 1000
Desired Output:
id mn_date mx_date rate
--------------------------------------------------------
1 01/01/2017 12:00:00 am 03/01/2017 12:00:00 am 300
1 04/01/2017 12:00:00 am 04/01/2017 12:00:00 am 1000
1 05/01/2017 12:00:00 am 06/01/2017 12:00:00 am 500
1 07/01/2017 12:00:00 am 08/01/2017 12:00:00 am 1000
1 09/01/2017 12:00:00 am 09/01/2017 12:00:00 am 300
Final result to group by consecutive dates: (Thanks to Gordon )
select id, min(start_date), max(end_date), rate
from (
select id, start_date, end_date, rate, seqnum_i-seqnum_ir grp, sum(x) over(partition by id order by start_date) grp1
from (
select t.*,
row_number() over (partition by id order by start_date) as seqnum_i,
row_number() over (partition by id, rate order by start_date) as seqnum_ir,
case when LEAD(start_date) over (partition by id order by start_date)= end_date + 1
then 0
else 1
end x
from t
)
)
group by id, grp+grp1, rate
order by min(start_date);
Upvotes: 2
Views: 2755
Reputation: 49
I found that the last value wasn't being grouped correctly as the calculation of X wasn't handling the NULL return, so I changed it to this:
,CASE
WHEN LEAD (start_date)
OVER (PARTITION BY id ORDER BY start_date)
IS NULL
THEN
0
WHEN LEAD (start_date)
OVER (PARTITION BY id ORDER BY start_date) =
end_date + 1
THEN
0
ELSE
1
END
x
Upvotes: 0
Reputation: 1270873
Assuming we can just use start_date
to identify the adjacent records (i.e., there are no gaps), then you can use the difference of row numbers approach:
select id, min(start_date) as mn_date, max(end_date) as mx_date, rate
from (select t.*,
row_number() over (partition by id order by start_date) as seqnum_i,
row_number() over (partition by id, rate order by start_date) as seqnum_ir
from t
) t
group by id (seqnum_i - seqnum_ir), rate;
To see how this works, look at the results of the subquery. You should be able to "see" how the difference of the two row numbers defines the groups of adjacent records with the same rate.
Upvotes: 1