Reputation: 3076
i need to merge interval objects to get distinct ranges of intervals based on extra parameters. How is the best way to do that?
It's about unambiguous statement whether in a given hour state is true. The returned list must have non-duplicated intervals.
Interval object description:
{
'startDate': datetime.datetime, # start of interval
'endDate': datetime.datetime, # end of interval
'prioritized': bool # if True - it's always important, override no-prioritized intervals
'state': bool # result of interval
}
In the examples below i changed startDate/endDate to strings to make them look better.
Interval list look like:
interval_list = [
{'startDate': '10:00:00', 'endDate': '12:00:00', 'prioritized': False, 'state': False},
{'startDate': '11:00:00', 'endDate': '18:00:00', 'prioritized': True, 'state': True},
{'startDate': '13:00:00', 'endDate': '17:00:00', 'prioritized': False, 'state': False},
{'startDate': '17:00:00', 'endDate': '20:00:00', 'prioritized': False, 'state': True},
{'startDate': '19:30:00', 'endDate': '19:45:00', 'prioritized': True, 'state': False}
]
I am trying to achieve the following:
merge(interval_list)
should return:
[
{'startDate': '10:00:00', 'endDate': '11:00:00', 'state': False},
{'startDate': '11:00:00', 'endDate': '19:30:00', 'state': True},
{'startDate': '19:30:00', 'endDate': '19:45:00', 'state': False},
{'startDate': '19:45:00', 'endDate': '20:00:00', 'state': True},
]
I have following not completed code right now:
def merge_range(ranges: list):
ranges = sorted(ranges, key=lambda x: x['startDate'])
last_interval = dict(ranges[0])
for current_interval in sorted(ranges, key=lambda x: x['startDate']):
if current_interval['startDate'] > last_interval['endDate']:
yield dict(last_interval)
last_interval['startDate'] = current_interval['startDate']
last_interval['endDate'] = current_interval['endDate']
last_interval['prioritized'] = current_interval['prioritized']
last_interval['state'] = current_interval['state']
else:
if current_interval['state'] == last_interval['state']:
last_interval['endDate'] = max(last_interval['endDate'], current_interval['endDate'])
else:
pass # i stopped here
yield dict(last_interval)
And use it by merged_interval_list = list(merge_range(interval_list))
Is it a good way ?
Upvotes: 1
Views: 182
Reputation: 8180
When we deal with time intervals, the main idea is to sort the dates (start and end) along with their status: start
or end
. Here, we need an access to the original interval too, to handle priorities and states.
Let's try with this list:
interval_list = [
{'startDate': '10:00:00', 'endDate': '12:00:00', 'prioritized': False, 'state': False},
{'startDate': '11:00:00', 'endDate': '18:00:00', 'prioritized': True, 'state': True},
{'startDate': '13:00:00', 'endDate': '17:00:00', 'prioritized': False, 'state': False},
{'startDate': '17:00:00', 'endDate': '20:00:00', 'prioritized': False, 'state': True},
{'startDate': '19:30:00', 'endDate': '19:45:00', 'prioritized': True, 'state': False}
]
First, we convert datestrings to dates (as you did):
import datetime
day = '2019-05-10'
def get_datetime(d, t):
return datetime.datetime.strptime(d+" "+t, "%Y-%m-%d %H:%M:%S")
for interval in interval_list:
interval['startDate'] = get_datetime(day, interval['startDate'])
interval['endDate'] = get_datetime(day, interval['endDate'])
Now, we build a new list with the needed information:
L = sorted(
[(interval['startDate'], 1, i) for i, interval in enumerate(interval_list)]
+[(interval['endDate'], -1, i) for i, interval in enumerate(interval_list)]
)
L
is the following list of tuples (date, dir, index)
(dir
: 1 means it's a start date, -1 means it's an end date):
[(datetime.datetime(2019, 5, 10, 10, 0), 1, 0), (datetime.datetime(2019, 5, 10, 11, 0), 1, 1), (datetime.datetime(2019, 5, 10, 12, 0), -1, 0), (datetime.datetime(2019, 5, 10, 13, 0), 1, 2), (datetime.datetime(2019, 5, 10, 17, 0), -1, 2), (datetime.datetime(2019, 5, 10, 17, 0), 1, 3), (datetime.datetime(2019, 5, 10, 18, 0), -1, 1), (datetime.datetime(2019, 5, 10, 19, 30), 1, 4), (datetime.datetime(2019, 5, 10, 19, 45), -1, 4), (datetime.datetime(2019, 5, 10, 20, 0), -1, 3)]
Now we can iterate over L
and keep a track of the current state and priority to yield dates when state is modified according to given priority:
def interval_info(i):
interval = interval_list[i]
return interval['state'], interval['prioritized']
T = []
stack = []
for boundary_date, direction, i in L:
state, prioritized = interval_info(i) # state and priority of the current date
if direction == 1: # start date
if stack:
prev_state, prev_prioritized = interval_info(stack[-1]) # previous infos
if state != prev_state and prioritized >= prev_prioritized: # enter a new state with a greater or equal priority
T.append((boundary_date, state)) # enter in new state
else: # begin of covered area
T.append((boundary_date, state)) # enter in new state
stack.append(i) # add the opened interval
elif direction == -1: # end date
stack.remove(i) # remove the closed interval (i is a *value* in stack)
if stack:
prev_state, prev_prioritized = interval_info(stack[-1])
if state != prev_state and not prev_prioritized: # leave a non priority state
T.append((boundary_date, prev_state)) # re-enter in prev state
else: # end of covered area
T.append((boundary_date, None)) # enter in None state
The value of T
is:
[(datetime.datetime(2019, 5, 10, 10, 0), False), (datetime.datetime(2019, 5, 10, 11, 0), True), (datetime.datetime(2019, 5, 10, 19, 30), False), (datetime.datetime(2019, 5, 10, 19, 45), True), (datetime.datetime(2019, 5, 10, 20, 0), None)]
You can then easily produce the output you wanted. Hope it helps!
EDIT: Bonus: how to convert start dates to time intervals:
>>> import datetime
>>> T = [(datetime.datetime(2019, 5, 10, 10, 0), False), (datetime.datetime(2019, 5, 10, 11, 0), True), (datetime.datetime(2019, 5, 10, 19, 30), False), (datetime.datetime(2019, 5, 10, 19, 45), True), (datetime.datetime(2019, 5, 10, 20, 0), None)]
>>> [{'startDate': s[0], 'endDate': e[0], 'state': s[1]} for s,e in zip(T, T[1:])]
[{'startDate': datetime.datetime(2019, 5, 10, 10, 0), 'endDate': datetime.datetime(2019, 5, 10, 11, 0), 'state': False}, {'startDate': datetime.datetime(2019, 5, 10, 11, 0), 'endDate': datetime.datetime(2019, 5, 10, 19, 30), 'state': True}, {'startDate': datetime.datetime(2019, 5, 10, 19, 30), 'endDate': datetime.datetime(2019, 5, 10, 19, 45), 'state': False}, {'startDate': datetime.datetime(2019, 5, 10, 19, 45), 'endDate': datetime.datetime(2019, 5, 10, 20, 0), 'state': True}]
You just have to zip every start date with the the next one to get intervals.
Upvotes: 0
Reputation: 3076
I got an answer for this question:
For first i separate events to prioritized and non-prioritize lists.
Based on the priority list, I create a negation of the interval on a given day.
Next i set prioritized list as main list and start iterate over non-prioritize list.
import datetime
from pprint import pprint
df = "%Y-%m-%d %H:%M:%S"
ds = "%Y-%m-%d"
events = {}
prioritized_events = {}
events["2019-05-10"] = [{
'startDate': datetime.datetime.strptime("2019-05-10 01:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 02:00:00", df),
'state': True
}, {
'startDate': datetime.datetime.strptime("2019-05-10 10:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 12:00:00", df),
'state': False
}, {
'startDate': datetime.datetime.strptime("2019-05-10 13:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 17:00:00", df),
'state': False
}, {
'startDate': datetime.datetime.strptime("2019-05-10 17:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 20:00:00", df),
'state': True
}]
prioritized_events["2019-05-10"] = [{
'startDate': datetime.datetime.strptime("2019-05-10 11:00:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 18:00:00", df),
'state': True
}, {
'startDate': datetime.datetime.strptime("2019-05-10 19:30:00", df),
'endDate': datetime.datetime.strptime("2019-05-10 20:00:00", df),
'state': False
}]
allowed_intervals = []
for event_date in prioritized_events:
minimal_time = datetime.datetime.combine(datetime.datetime.strptime(event_date, ds), datetime.time.min)
maximum_time = datetime.datetime.combine(datetime.datetime.strptime(event_date, ds), datetime.time.max)
for ev in prioritized_events[event_date]:
if ev['startDate'] != minimal_time:
allowed_intervals.append({
'startDate': minimal_time,
'endDate': ev['startDate']
})
minimal_time = ev['endDate']
if prioritized_events[event_date][len(prioritized_events[event_date]) - 1]['endDate'] != maximum_time:
allowed_intervals.append({
'startDate': prioritized_events[event_date][len(prioritized_events[event_date]) - 1]['endDate'],
'endDate': maximum_time
})
for event_date in events:
if event_date not in prioritized_events:
prioritized_events[event_date] = events[event_date]
else:
for ev in events[event_date]:
start = ev['startDate']
end = ev['endDate']
state = ev['state']
done = False
for allowed_interval in allowed_intervals:
if start >= allowed_interval['startDate'] and end <= allowed_interval['endDate']:
prioritized_events[event_date].append({
'startDate': start,
'endDate': end,
'state': state
})
done = True
break
elif allowed_interval['startDate'] <= start < allowed_interval['endDate'] < end:
prioritized_events[event_date].append({
'startDate': start,
'endDate': allowed_interval['endDate'],
'state': state
})
start = allowed_interval['endDate']
elif start < allowed_interval['startDate'] and start < allowed_interval['endDate'] < end:
prioritized_events[event_date].append({
'startDate': allowed_interval['startDate'],
'endDate': allowed_interval['endDate'],
'state': state
})
start = allowed_interval['endDate']
elif start < allowed_interval['startDate'] and start < allowed_interval['endDate'] and allowed_interval['startDate'] < end <= allowed_interval['endDate']:
prioritized_events[event_date].append({
'startDate': allowed_interval['startDate'],
'endDate': end,
'state': state
})
start = end
if done:
continue
prioritized_events[event_date] = sorted(prioritized_events[event_date], key=lambda k: k['startDate'])
And now sorted list:
pprint(prioritized_events["2019-05-10"])
returns:
[
{'startDate': datetime.datetime(2019, 5, 10, 1, 0),
'endDate': datetime.datetime(2019, 5, 10, 2, 0),
'state': True
},
{'startDate': datetime.datetime(2019, 5, 10, 10, 0),
'endDate': datetime.datetime(2019, 5, 10, 11, 0),
'state': False
},
{'startDate': datetime.datetime(2019, 5, 10, 11, 0),
'endDate': datetime.datetime(2019, 5, 10, 18, 0),
'state': True
},
{'startDate': datetime.datetime(2019, 5, 10, 18, 0),
'endDate': datetime.datetime(2019, 5, 10, 19, 30),
'state': True
},
{'startDate': datetime.datetime(2019, 5, 10, 19, 30),
'endDate': datetime.datetime(2019, 5, 10, 20, 0),
'state': False
}
]
Upvotes: 1