Reputation: 80
I'm trying to optimize a solution that times out on a competitive programming website. I started using cProfile and it appears to show bisect_right()
requiring 4x as much time as insort_right()
. This is a profiling run on input list with over 40k entries:
89936 function calls in 3.596 seconds
Ordered by: standard name
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 3.596 3.596 <string>:1(<module>)
1 0.180 0.180 3.596 3.596 leetcode.py:38(go)
1 3.357 3.357 3.415 3.415 leetcode.py:6(numSubarrayProductLessThanK)
44965 0.045 0.000 0.045 0.000 {built-in method _bisect.bisect_right}
44965 0.014 0.000 0.014 0.000 {built-in method _bisect.insort_right}
1 0.000 0.000 3.596 3.596 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {built-in method builtins.print}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
I thought all list insertions were O(n) since on average n/2 elements must be moved. And simply determining the location of insertion into sorted list would be O(log n). But in the profile report it looks flipped: the inserter insort_right()
is slower than the location determiner bisect_right()
. Where am I going wrong? Here is the code:
from bisect import insort, bisect
class Solution:
def numSubarrayProductLessThanK(self, nums, k):
if k == 0:
return 0
result = 0
product = 1
products = [1]
products_n = 1
for num in nums:
product *= num
factor_min = product // k
n_factors_less = products_n - bisect(products, factor_min)
result += n_factors_less
insort(products, product)
products_n += 1
return result
Thank you for looking.
Upvotes: 0
Views: 243
Reputation: 281207
Your bisect
and insort
calls pass completely different arguments.
Assuming nums
is a list of positive integers, your insort
calls will always insert the new product
at the end of the products
list, which takes O(1) rather than O(n) time for insertion, and which is highly amenable to branch prediction for the binary search. (Python object comparisons are more complex than a simple hardware comparison instruction, but branch prediction still applies to the logic within the comparison hooks, particularly since int
comparisons are implemented in C.)
Meanwhile, assuming k
is substantially greater than typical elements of nums
, your bisect
calls will find a spot for factor_min
somewhere in the middle of the list, likely near but not at the end most of the time. This is much less amenable to branch prediction.
Without a more sophisticated test, I cannot be sure that branch prediction is the dominating factor, but it seems likely.
Upvotes: 2