C. Refsgaard

Reputation: 227

Working with lists of tuples in Python

I am working with a dictionary where each key maps to a list of tuples. It looks like this:

dict1 = {'key1': [(time1, value1), (time2, value2), (time3, value3)],
         'key2': [(time4, value4), (time5, value5), (time6, value6)],
         'key3': [(time7, value7), (time8, value8), (time9, value9)], ...}

What I wish to do for each key is to find the largest drop in 'valueX' from 'timeX' to 'timeY'.

The tuples are ordered so that

time1 < time2 < time3 

And it is (typically) true that

value1 > value2 > value3

Both things are true for all keys.

So looking at the first key, what I wish to do is to calculate

value2 - value1 and value3 - value2

And save the times between which the biggest drop occurs. Let's say that

value2 - value1 > value3 - value2

Then I wish to save time1 and time2, since it was between those two time values that the largest drop occurred.

I am thinking to use a for-loop like the following:

for key in dict1:
    for i in dict1[key]:

But I cannot figure out how to

1) loop through the values, calculate the difference between the present value and the previous value, save it, and compare it to the largest drop observed so far

2) save the times that correspond to the largest drop in 'value'.

I hope you can help me out here. Thanks a lot.

Upvotes: 1

Views: 79

Answers (2)

tobias_k

Reputation: 82929

Assuming that the lists are already sorted by time, and you always want to compare consecutive values (and not, e.g. values that have the same time difference in between), you can use the zip(lst, lst[1:]) recipe to iterate consecutive pairs in the list, and use max with a custom key function to find the pair with the biggest difference.

def biggest_drop(timeseries):
    pairs = zip(timeseries, timeseries[1:])
    ((t1, v1), (t2, v2)) = max(pairs, key=lambda p: p[0][1] - p[1][1])
    return (t1, t2)

dict1 = {'key1': [("time1", 23), ("time2", 22), ("time3", 24)],
         'key2': [("time4", 12), ("time5", 9), ("time6", 3)],
         'key3': [("time7", 43), ("time8", 50), ("time9", 30)]}
print({k: biggest_drop(v) for k, v in dict1.items()})
# {'key1': ('time1', 'time2'), 'key2': ('time5', 'time6'), 'key3': ('time8', 'time9')}

Or shorter (but not necessarily better):

def biggest_drop(timeseries):
    return next(zip(*max(zip(timeseries, timeseries[1:]), 
                         key=lambda p: p[0][1] - p[1][1])))

Also, note that if you are looking for the biggest drop, you have to find the maximum for value1 - value2 instead of value2 - value1.
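For comparison with the for-loop the question started from, here is a minimal sketch of the same search written as an explicit loop. It keeps track of the largest drop seen so far together with its two times. The name biggest_drop_loop and the choice to return None for a series with fewer than two entries are assumptions made for illustration, not part of the code above.

```python
def biggest_drop_loop(timeseries):
    """Return (t1, t2, drop) for the largest fall between consecutive entries.

    Returns None when there are fewer than two entries (an assumption;
    max() on an empty sequence would raise ValueError instead).
    """
    best = None
    # Pair each entry with its successor: (e0, e1), (e1, e2), ...
    for (t1, v1), (t2, v2) in zip(timeseries, timeseries[1:]):
        drop = v1 - v2  # positive when the value decreases
        # Strict '>' keeps the earliest pair on ties, as max() does.
        if best is None or drop > best[2]:
            best = (t1, t2, drop)
    return best


series = [("time1", 23), ("time2", 22), ("time3", 24)]
print(biggest_drop_loop(series))
# ('time1', 'time2', 1)
```

Returning the drop alongside the times makes it easy to check the sign: a negative third element would mean the value never actually fell.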

Upvotes: 2

Ajax1234

Reputation: 71461

For Python 3, the drop between each pair of consecutive values can be computed by mapping operator.sub over the values and their successors; the index of the largest drop then gives the two times:

import operator

def get_times(d):
    final_data = {}
    for key, series in d.items():
        values = [value for _, value in series]
        # drops[i] is values[i] - values[i + 1]; positive means a decrease
        drops = list(map(operator.sub, values, values[1:]))
        i = drops.index(max(drops))  # first index of the largest drop
        final_data[key] = [(series[i][0], series[i + 1][0])]
    return final_data

dict1 = {'key1': [(1, 3), (23, 12), (3, 5)],
         'key2': [(4, 41), (5, 54), (4, 6)],
         'key3': [(7, 17), (8, 18), (9, 19)]}
print(get_times(dict1))

Output:

{'key1': [(23, 3)], 'key2': [(5, 4)], 'key3': [(7, 8)]}

Note that since the variables time1, value1, etc. were not specified, I used integers for both; string times with integer values would work just as well.
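Since only the values are subtracted, the times can be of any type. As a small sketch, here is the pairwise zip idea from the other answer applied to datetime objects; the sample readings and the name largest_drop_times are made up for illustration.

```python
from datetime import datetime


def largest_drop_times(series):
    # Compare each entry with the next one and keep the pair with the largest fall.
    (t1, v1), (t2, v2) = max(zip(series, series[1:]),
                             key=lambda pair: pair[0][1] - pair[1][1])
    return t1, t2


# Hypothetical readings: (timestamp, value), sorted by timestamp.
readings = [(datetime(2018, 1, 1, 9, 0), 41),
            (datetime(2018, 1, 1, 10, 0), 35),
            (datetime(2018, 1, 1, 11, 0), 12)]
start, end = largest_drop_times(readings)
print(start.isoformat(), end.isoformat())
# 2018-01-01T10:00:00 2018-01-01T11:00:00
```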

Upvotes: 2
