Daniel Quintero
Daniel Quintero

Reputation: 1

Create row from previous and next rows if date are discontinuous

I need to create a row if Current End date compared to Start date from next row are discontinuous by each Employee Number. The dataframe looks like this:

Employee Number Start Date End Date
001 1999-11-29 2000-03-12
001 2000-03-13 2001-06-30
001 2001-07-01 2002-01-01
002 2000-09-18 2000-10-05
002 2000-10-06 2001-06-30
002 2004-05-01 2005-12-31
002 2008-01-01 2008-11-25

A Continuous flag column needs to identify these discontinuous values:

Employee Number Start Date End Date Continuous Flag Explanation
001 1999-11-29 2000-03-12 Y 2000-03-13 is 1d after 2000-03-12
001 2000-03-13 2001-06-30 Y 2001-07-01 is 1d after 2001-06-30
001 2001-07-01 2002-01-01 NaN missing 2023-01-01 End Date row
002 2000-09-18 2000-10-05 Y 2000-10-06 is 1d after 2000-10-05
002 2000-10-06 2001-06-30 N 2004-05-01 is not 1d after 2001-06-30
002 2004-05-01 2005-12-31 N 2008-01-01 is not 1d after 2005-12-31
002 2008-01-01 2008-11-25 NaN missing 2023-01-01 End Date row

Then, for those rows that are 'N', a row needs to be inserted with the discontinuous dates to make them continuous in between rows. If there is no next row, use '2023-01-01' by default. Here is the expected output:

Employee Number Start Date End Date Continuous Flag
001 1999-11-29 2000-03-12 Y
001 2000-03-13 2001-06-30 Y
001 2001-07-01 2002-01-01 Y
001 2002-01-02 2023-01-01 NaN
002 2000-09-18 2000-10-05 Y
002 2000-10-06 2001-06-30 Y
002 2001-07-01 2004-04-30 Y
002 2004-05-01 2005-12-31 Y
002 2006-01-01 2007-12-31 Y
002 2008-01-01 2008-11-25 Y
002 2008-11-26 2023-01-01 NaN

I tried idx for loop without success

Upvotes: 0

Views: 58

Answers (1)

Rick James
Rick James

Reputation: 142208

Plan A: (Filling in gaps)

  1. Create a table of all possible dates (in the desired range). (This is easy to do on the fly in MariaDB by using a seq_..., but messier in MySQL.)
  2. SELECT ... FROM that-table-of-dates LEFT JOIN your-table ON ...

As for filling in the gaps with values before (or after) the given hole. I don't understand the goals.

Plan B: (Simply discovering gaps)

Do a "self-join" of the table with itself. For this you must have consecutive ids. Since you don't have such, I am not sure what to do.

Then check whether the (end_date + INTERVAL 1 DAY) of one row matches the start_date of the 'next' row.

Plan C: (requires MySQL 8.0 or MariaDB 10.2)

Use LAG() (or `LEAD() windowing functions to compare a value in one row to the previous (or next) row.

This may be the simplest way to set the "continuous flag".

Be sure to check for discontinuity in EmployeeId as well as INTERVAL 1 DAY as mentioned above.

Upvotes: 1

Related Questions