Reputation: 71
Given an array of N elements find all the subsets of array with sum equal to the target value.
I have seen all the old questions available on this site related to subset sum but none of them worked for me.
My Code is working fine for small inputs but it is taking very much time for N >150.
Is there any other efficient algorithm for doing this.
Please tell me how to optimize this code for larger inputs.
And here is my code
from collections import deque
class Pair:
def __init__(self, i, j, path_set):
self.i = i
self.j = j
self.path_set = path_set
def get_subsets(arr, n, value):
"""
This function appends all the possible paths in result list for the given target sum
Arguments:
arr = A list of numbers
n = number of elements in that list arr
value = Target sum for which we want to generate table
"""
# return immediately if there is no possible subset in arr whose sum is equal to value
if dp[n][value] == False:
return
queue = deque()
queue.append(Pair(n, value, set()))
while len(queue) > 0:
pair = queue.popleft()
if pair.i == 0 or pair.j == 0:
result.append(pair.path_set)
else:
exclude = dp[pair.i - 1][pair.j]
if exclude:
queue.append(Pair(pair.i-1, pair.j, pair.path_set))
if pair.j >= arr[pair.i-1]:
include = dp[pair.i - 1][pair.j - arr[pair.i -1]]
if include:
b = pair.path_set.copy()
b.add(pair.i - 1)
queue.append(Pair(pair.i - 1, pair.j-arr[pair.i-1], b))
def make_dp(arr, n, value):
"""
This function makes a table of target sum equal to the value
Arguments:
arr = A list of numbers
n = number of elements in that list arr
value = Target sum for which we want to generate table
Returns:
dp = A 2D boolean table
"""
dp = [[False for i in range(value+1)] for j in range(n+1)]
for i in range(n+1):
for j in range(value+1):
if j ==0:
dp[i][j] = True
elif i == 0:
dp[i][j] = False
else:
if dp[i-1][j]:
dp[i][j] = True
elif j >=arr[i-1]:
if dp[i-1][j-arr[i-1]]:
dp[i][j] = True
return dp
if __name__ == '__main__':
n = int(input())
arr = list(map(int, input().split()))
value = int(input())
dp = make_dp(arr, n, value)
result = []
get_subsets(arr, n, value)
print(result)
The input for which it is taking very much time:
200
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200
200
Please optimize this code or tell me any other approach for doing the same. Thanks in advance.
Upvotes: 1
Views: 1495
Reputation: 1
class SubSet_Sum:
def subset_sum(self, arr,target,res=[],temp= []):
if sum(arr)==target:
if not sorted(arr) in res:
res.append(sorted(arr))
else:
for i in range(len(arr)):
temp = arr[:i]+arr[i+1:]
self.subset_sum(temp, target,res)
return res
Upvotes: 0
Reputation: 42143
You can get this in O(n) time by creating a dictionary of cumulative sums that point to their respective index. When there exists a sum s+T
for a sum s
in the dictionary, you have a range that adds up to T
:
from itertools import accumulate
A = list(range(1,201))
T = 200
sums = {s:i for i,s in enumerate(accumulate(A)) }
result = [ [*range(i+1,sums[s+T]+1)] for s,i in sums.items() if s+T in sums ]
print(result)
# [[4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
# [37, 38, 39, 40, 41],
# [199]]
Even for 1 million values in the list, this takes less than a second.
Note that this assumes that all elements in the array are > 0.
It would possible to support zero and negative values with just a few alterations:
from itertools import accumulate
A = [*range(-10,11)]
T = 20
sums = dict()
for i,s in enumerate(accumulate(A)):
sums.setdefault(s,[]).append(i)
result = []
for cum,starts in sums.items():
if cum+T not in sums: continue
result.extend( [*range(s+1,e+1)] for s in starts
for e in sums[cum+T] if s<e )
print(A)
# [-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(result)
# [[9, 10, 11, 12, 13, 14, 15, 16], [12, 13, 14, 15, 16]]
This takes 2-3 seconds on a list with 1 million values but could be longer depending on the size of the result.
Upvotes: 1
Reputation: 953
You may find that using itertools and combinations is a bit more efficient. The code is also much simpler.
from itertools import chain, combinations
li = [1,2,3,4,5,6]
s=12
itr=chain.from_iterable(combinations(li, n) for n in range(len(li)+1))
result = [el for el in itr if sum(el)==s]
print(result)
Output:
[(1, 5, 6), (2, 4, 6), (3, 4, 5), (1, 2, 3, 6), (1, 2, 4, 5)]
Upvotes: 0