Reputation: 21
I am trying to understand a more efficient solution to Problem 347 on Leetcode. My solution was not as efficient. I do not understand why this solution runs in O(n)
time.
class Solution:
def topKFrequent(self, nums: List[int], k: int) -> List[int]:
# count up the frequencies of each number: # O(n)
counts = {}
for n in nums:
counts[n] = 1 + counts.get(n, 0) # add one to the count, default if not found is 0
# create our buckets: O(n)
freq_buckets = for [[] for i in range(len(nums) + 1)]
# add the counts to each of the buckets
for num, count in counts.items():
# for that specific frequency, add this num to the list of
# numbers occurring at that frequency
freq_buckets[count].append(num)
# hold our k most frequent
most_frequent = []
# iterate from most frequent to least frequent
for i in range(len(freq_buckets) - 1, 0, -1): # O(n): loop 3
for num in freq_buckets[i]: # O(n): loop 4
most_frequent.append(num)
# break when we have k most frequent
if len(most_frequent) == k:
return most_frequent
I believed it ran in O(kn)
time for a moment because loop 3 would only run until k. However, on further inspection, I can see there is a balance between loop 3 and loop 4 because if the outer loop runs for n
iterations then the inner loop could run between 0
and n + 1
times. This leads me to believe that if the inner loop runs n
times then we must be at the bucket of 1
which means we've iterated over the outer loop almost n
times and then iterate over the inner loop k
times (which could be equal to n
if n = k
).
There are cases that I'm struggling to reason through.
Am I correct in thinking that this algorithm does not run in O(n)
time? If not, please explain why.
I have tried analysing the time complexities of each line and expected to find O(n)
.
I have also read another post here but do not understand how they come to their conclusion.
Upvotes: 1
Views: 252
Reputation: 456
Every iteration of the inner loop will append one element to most_frequent
. Once the list most_frequent
reaches size k
, the function will terminate. In addition, most_frequent
does not reset every time the outer loop increments by 1. Thus, the inner loop will run at most k
times, total, regardless of how many times the outer loop is running.
A complexity analysis on this gives O(N+K) -- N for the outer loop, and K for the inner loop. Since K <= N, this is equivalent to O(2N) ~ O(N).
The key idea here is that the inner loop is not running k
times for every one of the outer loop's iterations. The number of iterations of the inner loop across all iterations of the outer loop is bounded by a constant k
.
Upvotes: 0