Reputation: 483
I'm trying to merge overlapping datetime ranges. I have a list of datetime ranges, each stored as a (start, end) tuple:
data = [(datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0)), (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0)), (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0)), (datetime.datetime(2016, 1, 10, 23, 0), datetime.datetime(2016, 1, 11, 0, 30)), (datetime.datetime(2016, 1, 11, 2, 30), datetime.datetime(2016, 1, 11, 3, 30)), (datetime.datetime(2016, 1, 11, 13, 0), datetime.datetime(2016, 1, 11, 16, 0)), (datetime.datetime(2016, 1, 11, 14, 0), datetime.datetime(2016, 1, 11, 14, 0)), (datetime.datetime(2016, 1, 11, 20, 30), datetime.datetime(2016, 1, 11, 21, 30)), (datetime.datetime(2016, 1, 11, 22, 0), datetime.datetime(2016, 1, 11, 22, 0)), (datetime.datetime(2016, 1, 12, 2, 30), datetime.datetime(2016, 1, 12, 3, 30)), (datetime.datetime(2016, 1, 12, 13, 0), datetime.datetime(2016, 1, 12, 16, 0)), (datetime.datetime(2016, 1, 12, 14, 0), datetime.datetime(2016, 1, 12, 14, 0)), (datetime.datetime(2016, 1, 12, 19, 30), datetime.datetime(2016, 1, 12, 20, 30)), (datetime.datetime(2016, 1, 12, 22, 0), datetime.datetime(2016, 1, 12, 22, 0)), (datetime.datetime(2016, 1, 13, 2, 30), datetime.datetime(2016, 1, 13, 3, 30)), (datetime.datetime(2016, 1, 13, 13, 0), datetime.datetime(2016, 1, 13, 16, 0)), (datetime.datetime(2016, 1, 13, 14, 0), datetime.datetime(2016, 1, 13, 14, 0)), (datetime.datetime(2016, 1, 13, 20, 0), datetime.datetime(2016, 1, 13, 21, 0)), (datetime.datetime(2016, 1, 13, 21, 30), datetime.datetime(2016, 1, 13, 22, 0)), (datetime.datetime(2016, 1, 13, 22, 0), datetime.datetime(2016, 1, 13, 22, 0)), (datetime.datetime(2016, 1, 14, 2, 30), datetime.datetime(2016, 1, 14, 3, 30)), (datetime.datetime(2016, 1, 14, 13, 0), datetime.datetime(2016, 1, 14, 16, 0)), (datetime.datetime(2016, 1, 14, 14, 0), datetime.datetime(2016, 1, 14, 14, 0)), (datetime.datetime(2016, 1, 14, 22, 0), datetime.datetime(2016, 1, 14, 22, 0)), (datetime.datetime(2016, 1, 14, 22, 0), datetime.datetime(2016, 1, 
14, 23, 0)), (datetime.datetime(2016, 1, 15, 2, 30), datetime.datetime(2016, 1, 15, 3, 30)), (datetime.datetime(2016, 1, 15, 13, 0), datetime.datetime(2016, 1, 15, 16, 0)), (datetime.datetime(2016, 1, 15, 14, 0), datetime.datetime(2016, 1, 15, 14, 0)), (datetime.datetime(2016, 1, 15, 20, 30), datetime.datetime(2016, 1, 15, 22, 0)), (datetime.datetime(2016, 1, 15, 22, 0), datetime.datetime(2016, 1, 15, 22, 0)), (datetime.datetime(2016, 1, 16, 2, 30), datetime.datetime(2016, 1, 16, 3, 30)), (datetime.datetime(2016, 1, 16, 13, 0), datetime.datetime(2016, 1, 16, 16, 0)), (datetime.datetime(2016, 1, 17, 2, 30), datetime.datetime(2016, 1, 17, 3, 30))]
Here's my current code:
import datetime
def merge_date_ranges(data):
    result = []
    for t1, t2 in ((data[i], data[i+1]) for i in range(len(data)-1)):
        if t1[1] >= t2[0]:
            result.append((min(t1[0], t2[0]), max(t1[1], t2[1])))
        else:
            result.append(t1)
If T1 (first datetime range) and T2 (second datetime range) do NOT overlap then I just add T1 to the new list (result). If T1 and T2 DO overlap, then I add the merged tuple to the new list (result).
My problem is what happens after a merge. For example:
T1 = (datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0))
T2 = (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0))
T1 and T2 are merged and the following is added to my new list:
(datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0))
So now I want my code (in the next iteration of the for loop) to compare the merged tuple (new T1) with the next datetime range in my list:
T1 = (datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0))
T2 = (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0))
But instead, here's what T1 and T2 look like:
T1 = (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0))
T2 = (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0))
And T1 gets added to my new list (which I don't want) because it was already merged previously.
But I just can't get my head around how to do this. It would be easier if I could update my original list by replacing T2 with the merged tuple and deleting T1. But as I understand it, changing a list while looping over it is either not possible or at least not good practice.
After a week of pulling my hair out, I'm posting my first question here in the hope that someone can help me get my sanity back. :)
Update: Basically, I want to end up with a new list in which no datetime ranges overlap.
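The behaviour described above can be reproduced with just the first three ranges from the list. This is only a minimal sketch: `pairwise_merge` and `sample` are illustrative names, and the function mirrors the loop from the question with a `return` added so the result can be inspected.

```python
import datetime

def pairwise_merge(data):
    # Same logic as the loop in the question: each range is compared only
    # with its immediate neighbour, never with the last merged result.
    result = []
    for t1, t2 in zip(data, data[1:]):
        if t1[1] >= t2[0]:
            result.append((min(t1[0], t2[0]), max(t1[1], t2[1])))
        else:
            result.append(t1)
    return result

sample = [
    (datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0)),
    (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0)),
    (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0)),
]

for r in pairwise_merge(sample):
    print(r)
```

The output contains the merged 13:00 to 16:00 range followed by the 14:00 range that was already absorbed, and the final 22:00 range never appears because the last element is only ever seen as T2.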
Upvotes: 3
Views: 3515
Reputation: 2913
Edit: given the new information, you should still do a loop, but base it on a boolean value and check for overlaps.
import datetime
def merge_date_ranges(data):
    ranges = data  # assumed to be sorted by start time
    overlap = True  # we assume there's overlap to begin with
    while overlap:
        overlap = False  # will remain False unless an overlap is found
        result = []  # start each round with an empty result
        i = 0
        while i < len(ranges):
            t1 = ranges[i]
            if i + 1 < len(ranges) and t1[1] >= ranges[i+1][0]:
                t2 = ranges[i+1]
                overlap = True  # an overlap was found, so the loop will continue
                result.append((min(t1[0], t2[0]), max(t1[1], t2[1])))
                i += 2  # skip t2, it is now part of the merged range
            else:
                result.append(t1)
                i += 1
        ranges = result  # preparing the next round
    return ranges
Upvotes: 1
Reputation: 5866
I guess this is what you want. Give it a test and comment:
def merge_date_ranges(data):
    result = []
    t_old = data[0]
    for t in data[1:]:
        if t_old[1] >= t[0]:  # I assume that the data is sorted already
            t_old = (min(t_old[0], t[0]), max(t_old[1], t[1]))
        else:
            result.append(t_old)
            t_old = t
    else:
        # for/else: runs once the loop finishes, adding the last pending range
        result.append(t_old)
    return result
I am assuming that the dates are already sorted.
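If the input might not be sorted, or might be empty, a variant along these lines could be used instead. This is a sketch under my own assumptions, not a drop-in replacement: `merge_unsorted_ranges`, `d` and `shuffled` are names I made up for illustration, and ranges that merely touch (end equal to the next start) are merged, matching the `>=` comparison used above.

```python
import datetime

def merge_unsorted_ranges(data):
    """Merge overlapping (start, end) tuples; input order does not matter."""
    merged = []
    # Tuples sort by start first, then by end.
    for start, end in sorted(data):
        if merged and merged[-1][1] >= start:
            # Overlaps (or touches) the last merged range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

d = datetime.datetime  # short alias, only for this example
shuffled = [
    (d(2016, 1, 10, 22, 0), d(2016, 1, 10, 22, 0)),
    (d(2016, 1, 10, 14, 0), d(2016, 1, 10, 14, 0)),
    (d(2016, 1, 10, 13, 0), d(2016, 1, 10, 16, 0)),
]
print(merge_unsorted_ranges(shuffled))
print(merge_unsorted_ranges([]))
```

Sorting costs O(n log n), but it removes the need to trust the order of the input, and the `merged and ...` check means an empty list simply returns an empty list instead of raising an IndexError on `data[0]`.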
BTW, I see that the only odd ranges are those whose start and end are identical, so maybe you should fix your input data instead. These are the input ranges that no longer appear in the merged result:
salida = merge_date_ranges(data)
for item in [t for t in data if t not in salida]:
    print(item)
(datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0))
(datetime.datetime(2016, 1, 11, 14, 0), datetime.datetime(2016, 1, 11, 14, 0))
(datetime.datetime(2016, 1, 12, 14, 0), datetime.datetime(2016, 1, 12, 14, 0))
(datetime.datetime(2016, 1, 13, 14, 0), datetime.datetime(2016, 1, 13, 14, 0))
(datetime.datetime(2016, 1, 13, 22, 0), datetime.datetime(2016, 1, 13, 22, 0))
(datetime.datetime(2016, 1, 14, 14, 0), datetime.datetime(2016, 1, 14, 14, 0))
(datetime.datetime(2016, 1, 14, 22, 0), datetime.datetime(2016, 1, 14, 22, 0))
(datetime.datetime(2016, 1, 15, 14, 0), datetime.datetime(2016, 1, 15, 14, 0))
(datetime.datetime(2016, 1, 15, 22, 0), datetime.datetime(2016, 1, 15, 22, 0))
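If fixing the input is an option, one possible reading of that is to drop ranges whose start and end are equal before merging. A minimal sketch, where `drop_zero_length` is a hypothetical helper of mine and not part of the code above:

```python
import datetime

def drop_zero_length(data):
    # Keep only ranges that actually span some time (start != end).
    return [(start, end) for start, end in data if start != end]

sample = [
    (datetime.datetime(2016, 1, 10, 13, 0), datetime.datetime(2016, 1, 10, 16, 0)),
    (datetime.datetime(2016, 1, 10, 14, 0), datetime.datetime(2016, 1, 10, 14, 0)),
    (datetime.datetime(2016, 1, 10, 22, 0), datetime.datetime(2016, 1, 10, 22, 0)),
]
cleaned = drop_zero_length(sample)
print(cleaned)
```

Be aware that this also removes zero-length ranges that overlap nothing, such as the 22:00 entry on 2016-01-10, so it is only appropriate if those entries carry no meaning for you.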
Upvotes: 1