Vlad Riabets
Vlad Riabets

Reputation: 18

How to optimise this code to work faster? Guess it is written algorithmically not perfect

I have a dictionary as on input. For each key in dict the value is a list of integers. Every integer in this list is a records of time in seconds that shows when some action occured. My goal is to find for each key the maximum number of actions that took place during 1 minute.

I have written code that solves this problem but it is doing it too slowly. It did not pass the test (not my test, I don't know what inputs it contained). I have polished my code iterationally but it is still too slowly, so I think I am using bad algorithm. Can you help me to find a good one if it exists? You can use standard library only!

trades_for_exc = {}

# largest_for_exc is an output dict
largest_for_exc = {}

# seconds_for_exc is an unput dict
for key, value in seconds_for_exc.items():
    val_len = len(value)
    largest = 0

    # iterate through all values in some list
    for i in range(0, val_len):
        new_largest = 1
        i_in_seconds = value[i]

        # iterate through all values starting from current to compare time
        for trades in value[i+1:]:
            if trades - i_in_seconds < 60.0:
                new_largest += 1
            else:
                break

        if new_largest > largest:
            largest = new_largest

    largest_for_exc[key] = largest

Sample input: seconds_for_exc = {"A": [34201.034, 34255.000, 34255.554, 34261.034], "B": [34255.556, 34261.033]}

UPDATE The right code should look like this:

largest_for_exc = {}

for key, value in seconds_for_exc.items():
    val_len = len(value)
    i = 0
    j = 1
    largest = 0
    while j < val_len:
        if (value[j] - value[i]) < 60:
            new_largest = j - i + 1
            if new_largest > largest:
                largest = new_largest
            j += 1
        else:
            i += 1
    largest_for_exc[key] = largest

Upvotes: 0

Views: 98

Answers (3)

Adri
Adri

Reputation: 599

I invite you to use line_profiler (https://github.com/rkern/line_profiler) to see which line is the most time consuming.

It is generally not advisable to concatenate many loops together as it increases the number of operations that are sometimes redundant.

One general advice would be to convert your lists of integers (that are contained in your dictionary) in np.array. Numpy data structures have a need for speed and are faster than lists.

It will help you to vectorize your loops and improve the performance of your code. Check this link: https://hackernoon.com/speeding-up-your-code-2-vectorizing-the-loops-with-numpy-e380e939bed3

Upvotes: 1

hager
hager

Reputation: 313

The runtime of your algorithm is quadratic in the number of items in each list, so it will be slow whenever the lists get large.

Instead you can use this approach, which has linear runtime: Use two indices, one for the start of your interval, one for the end. Advance the second index and record the largest span until time difference is >= 60. Then advance the start index until the difference is again < 60.

Repeat until you reach the end.

Upvotes: 0

Itamar Mushkin
Itamar Mushkin

Reputation: 2905

Here's a solution, I hope it'll be faster. The main idea is to use the times as indices and not as values, and using pandas .rolling() method to count the events each minute (where there are events), and then take the maximum:

import pandas as pd
trades_for_exc = {'a': [155, 200, 300], 'b':[10, 80, 120, 130], 'c': [0]}

max_actions_in_one_minute = {k: 
                             pd.Series(
                                 index=pd.to_datetime(v,unit='s'),
                                 data=1
                             ).rolling('60s').count().max()
                             for k,v in trades_for_exc.items()}

[Edited after addition that only standard library is to be used]: This can be done with list comprehension - see if it's any faster:

max_actions_in_one_minute = {k: max([len([y for y in v if y>=x and y<=x+60]) for x in v]) 
                             for k, v in trades_for_exc.items()}

Upvotes: 0

Related Questions