Reputation: 227
I am working with a dictionary where each key maps to a list of tuples. It looks like this:
dict1 = {'key1': [(time1, value1), (time2, value2), (time3, value3)],
         'key2': [(time4, value4), (time5, value5), (time6, value6)],
         'key3': [(time7, value7), (time8, value8), (time9, value9)], ...}
What I wish to do for each key is to find the largest drop in 'valueX' from 'timeX' to 'timeY'.
The tuples are ordered so that
time1 < time2 < time3
And it is (typically) true that
value1 > value2 > value3
Both things are true for all keys.
So looking at the first key, what I wish to do is to calculate
value2 - value1 and value3 - value2
And save the times between which the biggest drop occurs. Let's say that
value2 - value1 > value3 - value2
Then I wish to save time1 and time2, since it was between those two time values that the largest drop occurred.
I am thinking to use a for-loop like the following:
for key in dict1:
    for i in dict1[key]:
But I cannot figure out how to
1) loop through the values, calculate the difference between the present value and the past value, save it, and compare it to the largest drop that has been observed so far
2) save the times that correspond to the largest drop in 'value'.
I hope you can help me out here. Thanks a lot.
Upvotes: 1
Views: 79
Reputation: 82929
Assuming that the lists are already sorted by time, and you always want to compare consecutive values (and not, e.g. values that have the same time difference in between), you can use the zip(lst, lst[1:]) recipe to iterate consecutive pairs in the list, and use max with a custom key function to find the pair with the biggest difference.
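def biggest_drop(timeseries):
    # Pair each (time, value) tuple with its successor
    pairs = zip(timeseries, timeseries[1:])
    # The key is v1 - v2, so the pair with the largest fall in value wins
    ((t1, v1), (t2, v2)) = max(pairs, key=lambda p: p[0][1] - p[1][1])
    return (t1, t2)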
dict1 = {'key1': [("time1", 23), ("time2", 22), ("time3", 24)],
         'key2': [("time4", 12), ("time5", 9), ("time6", 3)],
         'key3': [("time7", 43), ("time8", 50), ("time9", 30)]}
print({k: biggest_drop(v) for k, v in dict1.items()})
# {'key3': ('time8', 'time9'), 'key2': ('time5', 'time6'), 'key1': ('time1', 'time2')}
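If you prefer the explicit for-loop from the question, the same idea might be sketched as below. The name biggest_drop_loop and the choice to also return the size of the drop are my own assumptions, not something the question asked for.

```python
def biggest_drop_loop(timeseries):
    """Return (t_start, t_end, drop) for the largest drop between consecutive points."""
    best = None  # (drop, t_start, t_end) of the largest drop seen so far
    for (t1, v1), (t2, v2) in zip(timeseries, timeseries[1:]):
        drop = v1 - v2  # positive when the value falls
        if best is None or drop > best[0]:
            best = (drop, t1, t2)
    if best is None:
        return None  # fewer than two points: no consecutive pair exists
    drop, t1, t2 = best
    return (t1, t2, drop)


series = [("time1", 23), ("time2", 22), ("time3", 24)]
result = biggest_drop_loop(series)
print(result)  # ('time1', 'time2', 1)
```

Unlike max, which raises a ValueError on an empty sequence, this sketch returns None when a list has fewer than two entries.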
Or shorter (but not necessarily better):
def biggest_drop(timeseries):
    return next(zip(*max(zip(timeseries, timeseries[1:]),
                         key=lambda p: p[0][1] - p[1][1])))
Also, note that if you are looking for the biggest drop, you have to find the maximum for value1 - value2 instead of value2 - value1.
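To make the sign issue concrete, here is a small sketch with made-up numbers (the series below is illustrative and not taken from the question). It runs max over the same pairs with both key functions.

```python
series = [("t1", 10), ("t2", 4), ("t3", 3)]  # illustrative data, not from the question
pairs = list(zip(series, series[1:]))

# value1 - value2 is largest where the value falls the most
largest_drop = max(pairs, key=lambda p: p[0][1] - p[1][1])
# value2 - value1 is largest where the value falls the least (or rises the most)
smallest_drop = max(pairs, key=lambda p: p[1][1] - p[0][1])

print(largest_drop)   # (('t1', 10), ('t2', 4))
print(smallest_drop)  # (('t2', 4), ('t3', 3))
```

With value2 - value1 as the key, max picks the pair where the value changed least in the downward direction, which is the opposite of what the question is after.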
Upvotes: 2
Reputation: 71461
For Python 3, this problem can be solved in one line using itertools.accumulate:
from itertools import accumulate
import operator
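def get_times(d):
    # accumulate with operator.sub yields the running differences [v1, v1 - v2, v1 - v2 - v3];
    # the indexing below assumes exactly three (time, value) tuples per key.
    final_data = {a: [(b[0][0], b[1][0])
                      if list(accumulate([i[-1] for i in b], func=operator.sub))[0]
                      > list(accumulate([i[-1] for i in b], func=operator.sub))[1]
                      else (b[1][0], b[2][0])]
                  for a, b in d.items()}
    return final_data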
dict1 = {'key1': [(1, 3), (23, 12), (3, 5)],
         'key2': [(4, 41), (5, 54), (4, 6)],
         'key3': [(7, 17), (8, 18), (9, 19)]}
print(get_times(dict1))
Output:
{'key2': [(4, 5)], 'key3': [(7, 8)], 'key1': [(1, 23)]}
Note that since the variables time1, value1, etc. were not specified, I used integers for both, although a string value for the time variables and an integer value for the value variables is also valid.
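Since running differences and consecutive differences are easy to confuse, here is a small sketch of what accumulate with operator.sub produces, next to the drops between adjacent values. The list values is illustrative (it reuses the 'key3' values above), and the names running and drops are my own.

```python
from itertools import accumulate
import operator

values = [17, 18, 19]  # illustrative values

# Running differences: each step subtracts the next value from the running total
running = list(accumulate(values, operator.sub))
# Consecutive drops: previous value minus next value, one entry per adjacent pair
drops = [a - b for a, b in zip(values, values[1:])]

print(running)  # [17, -1, -20]
print(drops)    # [-1, -1]
```

The first list carries the accumulated result forward at every step, while the second compares each value only with its immediate neighbour.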
Upvotes: 2