divya.trehan573
divya.trehan573

Reputation: 564

QUERY to find gap in dates in a table

I have a table with effective start date and end date that is history data for employees. This table should contain all the details of th employee.

I need to find out through a query wherever there are breaks between dates. For example :

Table ABC :

EMP_NO  EFF_START_DATE   EFF_END_DATE          ORG             NAME          DOB          STATUS  
1       01-JAN-2010       28-MAR-2010         XYZ             SMITH         10-JAN-1990   SINGLE   
1       29-MAR-2010       29-AUG-2010         XYZ             SMITH         10-JAN-1990   MARRIED
1       20-OCT-2010       31-DEC-4712         XYZ             SMITH         10-JAN-1990   DIVORCEE 

2       04-FEB-2010       28-MAR-2010         XYZ             JOHN        10-JAN-1991   SINGLE   
2       29-MAR-2010       31-DEC-4712         XYZ             JOHN        10-JAN-1991   MARRIED 

3       02-FEB-2010       21-MAR-2010         XYZ             GEETA        10-JAN-1991   SINGLE   
3       29-MAR-2010       31-DEC-4712         XYZ             GEETA        10-JAN-1991   MARRIED 

Now for EMP NO 1 and 3 there is a gap. For example for emp no 1 after 29-aug-2010 and before 20-oct-2010 there should have been a record. similarly in emp no. 3 there should have been a records betwwen 21-mar-2010 to 29-mar-2010.

what query can i write for this

Upvotes: 2

Views: 697

Answers (3)

Boneist
Boneist

Reputation: 23578

The easiest way of doing this is to compare the end date of a row with the start date of the next. You can do this easily by using the LEAD analytic function, like so:

with abc as (select 1 emp_no, to_date('01/01/2010', 'dd/mm/yyyy') eff_start_date, to_date('28/03/2010', 'dd/mm/yyyy') eff_end_date, 'XYZ' org, 'SMITH' name, to_date('10/01/1990', 'dd/mm/yyyy') dob, 'SINGLE' status from dual union all
             select 1 emp_no, to_date('29/03/2010', 'dd/mm/yyyy') eff_start_date, to_date('29/08/2010', 'dd/mm/yyyy') eff_end_date, 'XYZ' org, 'SMITH' name, to_date('10/01/1990', 'dd/mm/yyyy') dob, 'MARRIED' status from dual union all
             select 1 emp_no, to_date('20/10/2010', 'dd/mm/yyyy') eff_start_date, to_date('31/12/4712', 'dd/mm/yyyy') eff_end_date, 'XYZ' org, 'SMITH' name, to_date('10/01/1990', 'dd/mm/yyyy') dob, 'DIVORCEE' status from dual union all
             select 2 emp_no, to_date('04/02/2010', 'dd/mm/yyyy') eff_start_date, to_date('28/03/2010', 'dd/mm/yyyy') eff_end_date, 'XYZ' org, 'JOHN' name, to_date('10/01/1990', 'dd/mm/yyyy') dob, 'SINGLE' status from dual union all
             select 2 emp_no, to_date('29/03/2010', 'dd/mm/yyyy') eff_start_date, to_date('31/12/4712', 'dd/mm/yyyy') eff_end_date, 'XYZ' org, 'JOHN' name, to_date('10/01/1990', 'dd/mm/yyyy') dob, 'MARRIED' status from dual union all
             select 3 emp_no, to_date('02/02/2010', 'dd/mm/yyyy') eff_start_date, to_date('21/03/2010', 'dd/mm/yyyy') eff_end_date, 'XYZ' org, 'GEETA' name, to_date('10/01/1990', 'dd/mm/yyyy') dob, 'SINGLE' status from dual union all
             select 3 emp_no, to_date('29/03/2010', 'dd/mm/yyyy') eff_start_date, to_date('31/12/4712', 'dd/mm/yyyy') eff_end_date, 'XYZ' org, 'GEETA' name, to_date('10/01/1990', 'dd/mm/yyyy') dob, 'MARRIED' status from dual)
-- end of mimicking your abc table; you won't need the above subquery, as you already have a table called abc.
select emp_no,
       eff_end_date + 1 gap_start_date,
       next_eff_start_date - 1 gap_end_date,
       org,
       name,
       'UKNOWN' status
from   (select emp_no,
               eff_start_date,
               eff_end_date,
               lead(eff_start_date) over (partition by emp_no order by eff_start_date) next_eff_start_date,
               org,
               name,
               dob,
               status
        from   abc)
where  next_eff_start_date - eff_end_date > 1;

    EMP_NO GAP_START_DATE GAP_END_DATE ORG NAME  STATUS
---------- -------------- ------------ --- ----- ------
         1 30-AUG-2010    19-OCT-2010  XYZ SMITH UKNOWN
         3 22-MAR-2010    28-MAR-2010  XYZ GEETA UKNOWN

N.B. You didn't say the kind of output you were expecting to see, so I've given you the start and end dates of the gap.

Also, like Lalit, I've used the subquery in the WITH clause to generate sample data. You would not need that subquery, since you already have an "abc" table.

Upvotes: 1

Emil Moise
Emil Moise

Reputation: 393

You can also find the solution by comparing, for each row, the start_date with the previous end_date for the same employee:

select 
    src.*, 
    src.start_date - src.prev_end_date gap_days 
  from (
    select 
        out_tab.emp_no, 
        (select max(in_tab.end_date) from ABC in_tab where in_tab.emp_no = out_tab.emp_no and in_tab.end_date < out_tab.start_date) prev_end_date, 
        out_tab.start_date 
      from 
        ABC out_tab 
  ) src
  where 
    start_date - prev_end_date > 1;


    EMP_NO PREV_END_DATE START_DATE   GAP_DAYS
---------- ------------- ---------- ----------
         1 29-AUG-10     20-OCT-10          52 
         3 21-MAR-10     29-MAR-10           8 

Upvotes: 2

Lalit Kumar B
Lalit Kumar B

Reputation: 49082

It is a typical Gaps and Islands problem. You need to find out the missing values in the sequence of dates.

For example,

SQL> WITH sample_data(dates) AS(
  2  SELECT  DATE '2015-01-01' FROM dual UNION
  3  SELECT  DATE '2015-01-02' FROM dual UNION
  4  SELECT  DATE '2015-01-03' FROM dual UNION
  5  SELECT  DATE '2015-01-05' FROM dual UNION
  6  SELECT  DATE '2015-01-06' FROM dual UNION
  7  SELECT  DATE '2015-01-07' FROM dual UNION
  8  SELECT  DATE '2015-01-10' FROM dual UNION
  9  SELECT  DATE '2015-01-11' FROM dual UNION
 10  SELECT  DATE '2015-01-12' FROM dual UNION
 11  SELECT  DATE '2015-01-13' FROM dual UNION
 12  SELECT  DATE '2015-01-20' FROM dual
 13  )
 14 -- end of sample_data mimicking real table
 15  SELECT MIN(missing_dates),
 16         MAX(missing_dates)
 17  FROM
 18    (SELECT missing_dates,
 19   missing_dates - row_number() OVER(ORDER BY missing_dates) rn
 20    FROM
 21   (SELECT min_date - 1 + LEVEL missing_dates
 22   FROM
 23  ( SELECT MIN(dates) min_date , MAX(dates) max_date FROM sample_data
 24  )
 25  CONNECT BY level <= max_date - min_date + 1
 26   MINUS
 27   SELECT dates FROM sample_data
 28   ) )
 29  GROUP BY rn ORDER BY rn;

MIN(MISSING_DATES) MAX(MISSING_DATES)
------------------ ------------------
2015-01-04         2015-01-04
2015-01-08         2015-01-09
2015-01-14         2015-01-19

NOTE

The WITH clause is only for building sample data for demonstration because you did not provide the create and insert statements. In reality you need to use your own table_name instead of sample_data.

Upvotes: 2

Related Questions