Reputation: 47
How to get optimized query for this
date_one | date_two
------------------------
01.02.1999 | 31.05.2003
01.01.2004 | 01.01.2010
02.01.2010 | 10.10.2011
11.10.2011 | (null)
I need to get this
date_one | date_two | group
------------------------------------
01.02.1999 | 31.05.2003 | 1
01.01.2004 | 01.01.2010 | 2
02.01.2010 | 10.10.2011 | 2
11.10.2011 | (null) | 2
The group number is assigned as follows. Order the rows by date_one
ascending. First row gets group = 1
. Then for each row if date_one
is the date immediately following date_two
of the previous row, the group
number stays the same as in the previous
row, otherwise it increases by one.
Upvotes: 0
Views: 47
Reputation: 39527
One way is using window function:
select
date_one,
date_two,
sum(x) over (order by date_one) grp
from (
select
t.*,
case when
lag(date_two) over (order by date_one) + 1 =
date_one then 0 else 1 end x
from t
);
It finds the date_two from the last row using analytic function lag
and check if it in continuation with date_one from this row (in increasing order of date_one).
lag(date_two) over (order by date_one)
(In the below explanation, when I say first, next, previous or last row, it's based on increasing order of date_one with null values at the end)
The above produces produces NULL
for the first row as there is no row before it to get date_two from and previous row's date_two for the subsequent rows.
case when
lag(date_two)
over (order by date_one) + 1 = date_one then 0
else 1 end
Since, the lag produces NULL
for the very first row (since NULL = anything
expression always finally evaluates to false), output of case will be 1.
For further rows, similar check will be done to produce a new column x
in the query output which has value 1 when the previous row's date_two is not in continuation with this row's date_one.
Then finally, we can do an incremental sum on x
to find the required group values. See the value of x below for understanding:
SQL> with t (date_one,date_two) as (
2 select to_date('01.02.1999','dd.mm.yyyy'),to_date('31.05.2003','dd.mm.yyyy') from dual union
all
3 select to_date('01.01.2004','dd.mm.yyyy'),to_date('01.01.2010','dd.mm.yyyy') from dual union
all
4 select to_date('02.01.2010','dd.mm.yyyy'),to_date('10.10.2011','dd.mm.yyyy') from dual union
all
5 select to_date('11.10.2011','dd.mm.yyyy'),null from dual
6 )
7 select
8 date_one,
9 date_two,
10 x,
11 sum(x) over (order by date_one) grp
12 from (
13 select
14 t.*,
15 case when
16 lag(date_two) over (order by date_one) + 1 =
17 date_one then 0 else 1 end x
18 from t
19 );
DATE_ONE DATE_TWO X GRP
--------- --------- ---------- ----------
01-FEB-99 31-MAY-03 1 1
01-JAN-04 01-JAN-10 1 2
02-JAN-10 10-OCT-11 0 2
11-OCT-11 0 2
SQL>
Upvotes: 0
Reputation: 1271023
You can do this using left join
and a cumulative sum:
select t.*, sum(case when tprev.date_one is null then 1 else 0 end) over (order by t.date_one) as grp
from t left join
t tprev
on t.date_one = tprev.date_two + 1;
The idea is to find where the gaps begin (using the left join) and then do a cumulative sum of such beginnings to define the group.
If you want to be more inscrutable, you could write this as:
select t.*,
count(*) over (order by t.date_one) - count(tprev.date_one) over (order by t.date_one) as grp
from t left join
t tprev
on t.date_one = tprev.date_two + 1;
Upvotes: 2