Reputation: 10083
Given an array of integers and an integer value K my task is to write a function that prints to the standard output the highest number for that value in the array and the past K entries before it.
Example Input:
tps: 6, 9, 4, 7, 4, 1
k: 3
Example Output:
6
9
9
9
7
7
I have been told that the code I have written could be made much more efficient for large data sets. How can I make this code most efficient?
def tweets_per_second(tps, k):
past = [tps[0]]
for t in tps[1:]:
past.append(t)
if len(past) > k: past = past[-k:]
print max(past)
Upvotes: 3
Views: 283
Reputation: 18556
You can achieve a linear time complexity using a monotonic queue(O(n) for any value of k). The idea is the following:
Let's maintain a deque of pairs (value, position). Initially, it is empty.
When a new element arrives, do the following: while the position of the front element is out of range(less than i - K), pop it. While the value of the back element is less than the new one, pop it. Finally, push a pair (current element, its position) to the back of the deque.
The answer for the current position is the front element of the deque.
Each element is added to the deque only once and removed at most once. Thus, the time complexity is linear and it does not depend on K. This solution is optimal because just reading the input is O(n).
Upvotes: 6
Reputation: 153
Try using a heap to achieve reducing the complexity of the max operation in from O(K)
to O(logK)
time.
(-tps[i])
*, i in range(0,k)
and output (-heap[0])
each timetps[i]
remove tps[i-k]
, and print (-heap[0])
Overall you get a O(N log(K)) algorithm, while what you use now is O(N*K). This will be very helpful if K is not small.
*Since the implementation of heap has the min(heap) in heap[0] as an invariant, if you add -value
the -heap[0]
will be the max(heap)
as you want it.
Upvotes: 3
Reputation: 9946
pandas can do this pretty well:
import pandas as pd
df = pd.DataFrame(dict(data=[6, 9, 4, 7, 4, 1]))
df['running_max'] = pd.expanding_max(df.data)
df['rolling_max'] = pd.rolling_max(df.data, 3, min_periods=0)
print df
data running_max rolling_max
0 6 6 6
1 9 9 9
2 4 9 9
3 7 9 9
4 4 9 7
5 1 9 7
Upvotes: 0