manj
manj

Reputation: 19

Identify contiguous and discontinuous date ranges

I have a table named x . The data is as follows.

Acccount_num    start_dt    end_dt
A111326      02/01/2016    02/11/2016
A111326      02/12/2016    03/05/2016
A111326      03/02/2016   03/16/2016
A111331      02/28/2016   02/29/2016
A111331      02/29/2016   03/29/2016
A999999      08/25/2015   08/25/2015
A999999      12/19/2015   12/22/2015
A222222      11/06/2015   11/10/2015
A222222      05/16/2016   05/17/2016

Both A111326 and A111331 should be identified as contiguous data and A999999 and
A222222 should be identified as discontinuous data.In my code I currently use the following query to identify discontinuous data. The A111326 is also erroneously identified as discontinuous data. Please help to modify the below code so that A111326 is not identified as discontinuous data.Thanks in advance for your help.

(SELECT account_num
                FROM (SELECT account_num,
                             (MAX (
                                 END_DT)
                              OVER (PARTITION BY account_num
                                    ORDER BY START_DT))
                                START_DT,
                             (LEAD (
                                 START_DT)
                              OVER (PARTITION BY account_num
                                    ORDER BY START_DT))
                                END_DT
                        FROM x
                         WHERE (START_DT + 1) <=
                                (END_DT - 1))
               WHERE START_DT < END_DT);

Upvotes: 0

Views: 1235

Answers (2)

MT0
MT0

Reputation: 168051

Oracle Setup:

CREATE TABLE accounts ( Account_num, start_dt, end_dt ) AS
SELECT 'A', DATE '2016-02-01', DATE '2016-02-11' FROM DUAL UNION ALL
SELECT 'A', DATE '2016-02-12', DATE '2016-03-05' FROM DUAL UNION ALL
SELECT 'A', DATE '2016-03-02', DATE '2016-03-16' FROM DUAL UNION ALL
SELECT 'B', DATE '2016-02-28', DATE '2016-02-29' FROM DUAL UNION ALL
SELECT 'B', DATE '2016-02-29', DATE '2016-03-29' FROM DUAL UNION ALL
SELECT 'C', DATE '2015-08-25', DATE '2015-08-25' FROM DUAL UNION ALL
SELECT 'C', DATE '2015-12-19', DATE '2015-12-22' FROM DUAL UNION ALL
SELECT 'D', DATE '2015-11-06', DATE '2015-11-10' FROM DUAL UNION ALL
SELECT 'D', DATE '2016-05-16', DATE '2016-05-17' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-01', DATE '2016-01-02' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-05', DATE '2016-01-06' FROM DUAL UNION ALL
SELECT 'E', DATE '2016-01-03', DATE '2016-01-07' FROM DUAL;

Query:

WITH times ( account_num, dt, lvl ) AS (
  SELECT Account_num, start_dt - 1,  1 FROM accounts
UNION ALL
  SELECT Account_num, end_dt,       -1 FROM accounts
)
, totals ( account_num, dt, total ) AS (
  SELECT account_num,
         dt,
         SUM( lvl ) OVER ( PARTITION BY Account_num ORDER BY dt, lvl DESC )
  FROM   times
)
SELECT Account_num,
       CASE WHEN COUNT( CASE total WHEN 0 THEN 1 END ) > 1
            THEN 'N'
            ELSE 'Y'
            END AS is_contiguous
FROM   totals
GROUP BY Account_Num
ORDER BY Account_Num;

Output:

ACCOUNT_NUM IS_CONTIGUOUS
----------- -------------
A           Y
B           Y
C           N
D           N
E           Y

Alternative Query:

(It's exactly the same method just using UNPIVOT rather than UNION ALL.)

SELECT Account_num,
       CASE WHEN COUNT( CASE total WHEN 0 THEN 1 END ) > 1
            THEN 'N'
            ELSE 'Y'
            END AS is_contiguous
FROM   (
  SELECT Account_num,
         SUM( lvl ) OVER ( PARTITION BY Account_Num
                           ORDER BY CASE lvl WHEN 1 THEN dt - 1 ELSE dt END,
                                    lvl DESC
                         ) AS total
  FROM   accounts
  UNPIVOT ( dt FOR lvl IN ( start_dt AS 1, end_dt AS -1 ) )
)
GROUP BY Account_Num
ORDER BY Account_Num;

Upvotes: 2

Matt
Matt

Reputation: 14341

WITH cte AS (
    SELECT
       AccountNumber
       ,CASE
          WHEN
             LAG(End_Dt) OVER (PARTITION BY AccountNumber ORDER BY End_Dt) IS NULL THEN 0
          WHEN
             LAG(End_Dt) OVER (PARTITION BY AccountNumber ORDER BY End_Dt) >= Start_Dt - 1 THEN 0
          ELSE 1
       END as discontiguous 
    FROM
       #Table
)

SELECT
    AccountNumber
    ,CASE WHEN SUM(discontiguous) > 0 THEN 'discontiguous' ELSE 'contiguous' END
FROM
    cte
GROUP BY
    AccountNumber;

One of your problems is that your contiguous desired result also includes overlapping date ranges in your example data set. Example A111326 Starts on 3/2/2016 but ends the row before on 3/5/2015 meaning it overlaps by 3 days.

Upvotes: 1

Related Questions