Oscar Vasquez
Oscar Vasquez

Reputation: 465

SQL remove colliding part of datetimes in a table according to value of a column

I have a table like

MemberID     MembershipStartDate           MembershipEndDate        type
=============================================================================
123          2010-01-01 10:00:00.000      2012-12-31 23:00:00.000   1
123          2011-01-01 21:00:00.000      2012-12-31 12:00:00.000   2
123          2013-05-01  9:00:00.000      2013-12-31  5:00:00.000   2
123          2014-01-01 14:00:00.000      2014-12-31  2:00:00.000   1
123          2014-01-01 11:00:00.000      2015-03-31  1:00:00.000   2

In which for a given member and type, the times do not collide: for type 1, there will be no start and finish row which will concur with other rows, so if I have member 123 type 1 start 2010-01-01 10:00:00.000 and finish 2012-12-31 23:00:00.000, I cannot have member 123 type 1 start 2010-02-01 10:00:00.000 finish 2013-12-31 23:00:00.000 since the range are colliding (I could however have this for type 2). This is my current table.

What I want to do is remove the collisions of times between different types for the same MemberID, so for memberID 123, if a row for type 2 started at 2013-05-01 9:00:00.000 and finished at 2013-12-31 5:00:00.000, and type 1 started at 2013-10-01 9:00:00.000 and finished at 2014-12-31 5:00:00.000, since the row for type 2 started first (the one that started later is the one trimmed), the one for type 1 would be trimmed to: 2013-12-31 5:00:00.000 till 2014-12-31 5:00:00.000, where as you can see, the new start date for the row is the finish date for the row of type 2.

At the end, the first table will end with

MemberID     MembershipStartDate           MembershipEndDate        type
=============================================================================
123          2010-01-01 10:00:00.000      2012-12-31 23:00:00.000   1
123          2012-12-31 23:00:00.000      2012-12-31 12:00:00.000   2
123          2013-05-01  9:00:00.000      2013-12-31  5:00:00.000   2
123          2014-01-01 14:00:00.000      2014-12-31  2:00:00.000   1
123          2014-12-31  2:00:00.000      2015-03-31  1:00:00.000   2

the times are not necessary in order.

Upvotes: 2

Views: 56

Answers (2)

Oscar Vasquez
Oscar Vasquez

Reputation: 465

See the accepted answer and comments on it to see the main idea and what I changed from it

SELECT t2.id, t2.code, MAX(case when t1.enddate > t2.startdate and t1.startdate < t2.startdate then t1.enddate else t2.startdate end), MAX(t2.enddate)
FROM @temporaryTable2 AS t2
LEFT JOIN @temporaryTable2 AS t1 ON t1.member_id = t2.member_id
      AND t1.Code != t2.Code
      AND t1.id != t2.id

GROUP BY t2.id, t2.code

Upvotes: 0

CSTobey
CSTobey

Reputation: 1511

First, I would recommend adding an auto_incrementing id field to the table, so that referencing each row is easier.

Second, use a self-referential query to find the offending records (and is often my desire, generate update sql).

SELECT CONCAT("UPDATE <table> SET enddate = ", QUOTE(t2.startdate), " WHERE id = ", t1.id, ";") AS stmt
  #, t1.*, t2.* # uncomment this line to see the raw data.
FROM <table> AS t1
JOIN <table> AS t2 ON t1.member_id = t2.member_id
      AND t1.type = t2.type
      AND t1.id != t2.id # this makes sure that you dont connect a record to itself. If you didnt have an autoincrementing key, you would have a nasty OR chain to accomplish this
WHERE t1.enddate > t2.startdate 
  AND t1.startdate < t2.startdate;

If you chose not to use and auto-incrementing pk, then:

AND t1.id != t2.id
#becomes something like:
AND NOT (t1.enddate = t2.enddate AND t1.startdate = t2.startdate)

depending on what the natural key actually is (excluding the parts of it that you are actually joining on).

Upvotes: 2

Related Questions