Marijan Gregurić
Marijan Gregurić

Reputation: 47

oracle dates group

How to get optimized query for this

date_one   | date_two 
------------------------
01.02.1999 | 31.05.2003
01.01.2004 | 01.01.2010
02.01.2010 | 10.10.2011 
11.10.2011 | (null)     

I need to get this

date_one   | date_two   | group 
------------------------------------
01.02.1999 | 31.05.2003 | 1
01.01.2004 | 01.01.2010 | 2
02.01.2010 | 10.10.2011 | 2
11.10.2011 | (null)     | 2

The group number is assigned as follows. Order the rows by date_one ascending. First row gets group = 1. Then for each row if date_one is the date immediately following date_two of the previous row, the group number stays the same as in the previous row, otherwise it increases by one.

Upvotes: 0

Views: 47

Answers (2)

Gurwinder Singh
Gurwinder Singh

Reputation: 39527

One way is using window function:

select
  date_one,
  date_two,
  sum(x) over (order by date_one) grp
from (
    select
        t.*,
        case when 
        lag(date_two) over (order by date_one) + 1 =
            date_one then 0 else 1 end x
    from t
);

It finds the date_two from the last row using analytic function lag and check if it in continuation with date_one from this row (in increasing order of date_one).

How it works:

lag(date_two) over (order by date_one)

(In the below explanation, when I say first, next, previous or last row, it's based on increasing order of date_one with null values at the end)

The above produces produces NULL for the first row as there is no row before it to get date_two from and previous row's date_two for the subsequent rows.

case when 
      lag(date_two) 
      over (order by date_one) + 1 = date_one then 0
     else 1 end

Since, the lag produces NULL for the very first row (since NULL = anything expression always finally evaluates to false), output of case will be 1.

For further rows, similar check will be done to produce a new column x in the query output which has value 1 when the previous row's date_two is not in continuation with this row's date_one.

Then finally, we can do an incremental sum on x to find the required group values. See the value of x below for understanding:

SQL> with t (date_one,date_two) as (
  2     select to_date('01.02.1999','dd.mm.yyyy'),to_date('31.05.2003','dd.mm.yyyy') from dual union
 all
  3     select to_date('01.01.2004','dd.mm.yyyy'),to_date('01.01.2010','dd.mm.yyyy') from dual union
 all
  4     select to_date('02.01.2010','dd.mm.yyyy'),to_date('10.10.2011','dd.mm.yyyy') from dual union
 all
  5     select to_date('11.10.2011','dd.mm.yyyy'),null from dual
  6  )
  7  select
  8    date_one,
  9    date_two,
 10    x,
 11    sum(x) over (order by date_one) grp
 12  from (
 13     select
 14             t.*,
 15             case when
 16             lag(date_two) over (order by date_one) + 1 =
 17                     date_one then 0 else 1 end x
 18     from t
 19  );

DATE_ONE  DATE_TWO           X        GRP
--------- --------- ---------- ----------
01-FEB-99 31-MAY-03          1          1
01-JAN-04 01-JAN-10          1          2
02-JAN-10 10-OCT-11          0          2
11-OCT-11                    0          2

SQL>

Upvotes: 0

Gordon Linoff
Gordon Linoff

Reputation: 1271023

You can do this using left join and a cumulative sum:

select t.*, sum(case when tprev.date_one is null then 1 else 0 end) over (order by t.date_one) as grp
from t left join
     t tprev
     on t.date_one = tprev.date_two + 1;

The idea is to find where the gaps begin (using the left join) and then do a cumulative sum of such beginnings to define the group.

If you want to be more inscrutable, you could write this as:

select t.*,
       count(*) over (order by t.date_one) - count(tprev.date_one) over (order by t.date_one) as grp
from t left join
     t tprev
     on t.date_one = tprev.date_two + 1;

Upvotes: 2

Related Questions