Reputation: 2109
I'm studying dynamic programming by doing Leetcode problems, and I frequently face time limit exceeded errors even though I'm caching my results. Can anyone explain why my version is so much slower than the official version for this problem?
There are obviously differences in the code, e.g., I use a class function for recursion while the official answer does not. My recursive function returns numeric values, the official one does not, etc. None of these seem like meaningful differences though, but the performance difference is nonetheless dramatic.
My version. This takes 0.177669
seconds to run, and receives a time limit exceeded error.
import datetime as dt
from typing import List
from functools import lru_cache
class Solution:
def canPartition(self, nums: List[int]) -> bool:
self.nums = nums
total = sum(self.nums)
if total % 2 == 1:
return False
half_total = total // 2
return self.traverse(half_total, 0) == 0
@lru_cache(maxsize=None)
def traverse(self, subset_sum, index):
if subset_sum < 0:
return float('inf')
elif index == len(self.nums):
return subset_sum
else:
include = self.traverse(subset_sum - self.nums[index], index + 1)
exclude = self.traverse(subset_sum, index + 1)
best = min(include, exclude)
return best
test_case = [20,68,68,11,48,18,50,5,3,51,52,11,13,11,38,100,30,87,1,56,85,63,14,96,7,17,54,11,32,61,94,13,85,10,78,57,69,92,66,28,70,20,3,29,10,73,89,86,28,48,69,54,87,11,91,32,59,4,88,20,81,100,29,75,79,82,6,74,66,30,9,6,83,54,54,53,80,94,64,77,22,7,22,26,12,31,23,26,65,65,35,36,34,1,12,44,22,73,59,99]
solution = Solution()
start = dt.datetime.now()
print(solution.canPartition(test_case))
end = dt.datetime.now()
print((end-start).total_seconds())
This is the official answer. It takes only 0.000165
seconds!
import datetime as dt
from typing import List, Tuple
from functools import lru_cache
class Solution:
def canPartition(self, nums: List[int]) -> bool:
@lru_cache(maxsize=None)
def dfs(nums: Tuple[int], n: int, subset_sum: int) -> bool:
# Base cases
if subset_sum == 0:
return True
if n == 0 or subset_sum < 0:
return False
result = (dfs(nums, n - 1, subset_sum - nums[n - 1])
or dfs(nums, n - 1, subset_sum))
return result
# find sum of array elements
total_sum = sum(nums)
# if total_sum is odd, it cannot be partitioned into equal sum subsets
if total_sum % 2 != 0:
return False
subset_sum = total_sum // 2
n = len(nums)
return dfs(tuple(nums), n - 1, subset_sum)
test_case = [20,68,68,11,48,18,50,5,3,51,52,11,13,11,38,100,30,87,1,56,85,63,14,96,7,17,54,11,32,61,94,13,85,10,78,57,69,92,66,28,70,20,3,29,10,73,89,86,28,48,69,54,87,11,91,32,59,4,88,20,81,100,29,75,79,82,6,74,66,30,9,6,83,54,54,53,80,94,64,77,22,7,22,26,12,31,23,26,65,65,35,36,34,1,12,44,22,73,59,99]
solution = Solution()
start = dt.datetime.now()
print(solution.canPartition(test_case))
end = dt.datetime.now()
print((end-start).total_seconds())
Upvotes: 0
Views: 124
Reputation: 3014
min
on two recursively gained result where you have to run two recursion to their deepest levels. by using booleans, you can shortcut this process and this other program uses that shortcut.Upvotes: 1
Reputation: 146
In the former version, all possible cases are searched. While in the latter, the algorithm stops when a feasible solution has been found.
In the first version:
include = self.traverse(subset_sum - self.nums[index], index + 1)
# Suppose {include} is zero, the answer is already obtained,
# but the algorithm still try to compute {exclude}, which is not neccessary.
exclude = self.traverse(subset_sum, index + 1)
In the second version:
result = (dfs(nums, n - 1, subset_sum - nums[n - 1])
or dfs(nums, n - 1, subset_sum))
# Because of the short-circuit behavior of logical operator,
# if the first branch has already obtained the solution,
# the second branch will not be executed.
Just adding a if-check will improve the performance:
include = self.traverse(subset_sum - self.nums[index], index + 1)
# Check whether we are already done:
if include == 0:
return include
exclude = self.traverse(subset_sum, index + 1)
Upvotes: 1
Reputation: 43505
It you want to know about performance, you need to profile your code. Profiling lets you see where your code spends its time.
CPython comes with built-in profiling module called cProfile
.
But you might want to look at e.g. line_profiler.
Upvotes: 1