Reputation: 35
The problem:
I need to find three numbers in array2 that add up or comes as close as possible to each number in array3(it has to be three numbers).
Print the corresponding index from list1 of each number used in array2
Can only use each number in array2 only twice.
The data: I have three sets of data in one list and two arrays. First list is the names, second array is number, and third array is targets. list1 and array2 are the same length(55) but not array3.
list1 = ['ab', 'ac', 'ad', 'ae', 'af', 'ag', 'ah', 'ai', 'aj', 'ak',
'bc', 'bd', 'be', 'bf', 'bg', 'bh', 'bi', 'bj', 'bk', 'cd', 'ce',
'cf', 'cg', 'ch', 'ci', 'cj', 'ck', 'de', 'df', 'dg', 'dh', 'di',
'dj', 'dk', 'ef', 'eg', 'eh', 'ei', 'ej', 'ek', 'fg', 'fh', 'fi',
'fj', 'fk', 'gh', 'gi', 'gj', 'gk', 'hi', 'hj', 'hk', 'ij', 'ik',
'jk']
array2 = [39, 6, 29, 38, 2, 34, 7, 6, 2, 3, 37, 13, 20, 18, 4, 14,
28, 2, 20, 25, 13, 38, 32, 28, 9, 7, 14, 11, 31, 29, 29, 39, 9, 35,
14, 34, 23, 31, 11, 2, 37, 19, 18, 6, 5, 12, 6, 33, 30, 22, 38, 37,
13, 31, 40]
array3 = [80, 74, 84, 89, 89, 78, 79, 85, 81, 89, 75, 86, 76, 71,
82, 79, 75, 78, 83, 89]
The results I'm looking for are:
For 80 in array3, use 39+38+3, which would be 'ab', 'ae', 'ak' from list1.
For 74 in array3, use 39+32+2, which would be 'ab', 'cg', 'ek' from list1
and so on.
Im trying to find a pythonic way of solving this, using python3.x. I have looked into itertools.combinations/permutations and the knapsack problem. The Knapsack problem has come the closest to solving this but evaluates two values to get the best solution against a target and I only need one. I'm not looking for someone to write the code for me(if you want to I won't stop you), rather I'm looking for someone with more experience then me to point me in the right direction for solving this problem.
Upvotes: 3
Views: 1106
Reputation: 694
The following algorithm searches e solution in the huge space of all triplets in array2
for all targets in array3
:
list1 = ['ab', 'ac', 'ad', 'ae', 'af', 'ag', 'ah', 'ai', 'aj', 'ak', 'bc', 'bd', 'be', 'bf', 'bg', 'bh', 'bi', 'bj', 'bk', 'cd', 'ce', 'cf', 'cg', 'ch', 'ci', 'cj', 'ck', 'de', 'df', 'dg', 'dh', 'di', 'dj', 'dk', 'ef', 'eg', 'eh', 'ei', 'ej', 'ek', 'fg', 'fh', 'fi', 'fj', 'fk', 'gh', 'gi', 'gj', 'gk', 'hi', 'hj', 'hk', 'ij', 'ik', 'jk']
array2 = [39, 6, 29, 38, 2, 34, 7, 6, 2, 3, 37, 13, 20, 18, 4, 14, 28, 2, 20, 25, 13, 38, 32, 28, 9, 7, 14, 11, 31, 29, 29, 39, 9, 35, 14, 34, 23, 31, 11, 2, 37, 19, 18, 6, 5, 12, 6, 33, 30, 22, 38, 37, 13, 31, 40]
array3 = [80, 74, 84, 89, 89, 78, 79, 85, 81, 89, 75, 86, 76, 71, 82, 79, 75, 78, 83, 89]
import itertools
import numpy as np
import heapq
import copy
list1 = np.array(list1, dtype=str)
array2 = np.array(array2, dtype=int)
array3 = np.array(array3, dtype=int)
m, n = len(array2), len(array3)
combs = [[] for __ in range(n)]
maxuses = 2
combinations = set(map(tuple, itertools.combinations(list(range(m))*maxuses, 3)))
print(f'searching in {len(combinations)}! space')
def dist(a, b):
return abs(a - b)
for i, target in enumerate(array3):
for comb in map(list, combinations):
combs[i].append((dist(target, sum(array2[comb])), comb))
combs[i].sort(key=lambda item: item[0])
tested = set()
cost = 0
locs = [0]*n
used = {i: [] for i in range(m)}
for i in range(n):
for value in combs[i][0][1]:
used[value].append(i)
cost += combs[i][0][0]
def priority(values):
return (np.array(list(map(len, values)))**2).sum()
minheap = [(cost, priority(used.values()), locs, used)]
count = 0
while minheap:
cost, __, locs, used = heapq.heappop(minheap)
count += 1
print(f'tested {count}, best cost {cost}, heap size {len(minheap)}')
for key in used:
if len(used[key]) > maxuses:
loc1 = used[key][-1]
loc2 = next(itertools.islice(filter(lambda x: x != loc1, used[key]), 0, None))
print(f'value at {key} is used by {len(used[key])} combinations')
# print(key, used[key])
# print(loc1, combs[loc1][locs[loc1]][1])
# print(loc2, combs[loc2][locs[loc2]][1])
for value in combs[loc1][locs[loc1]][1]:
used[value].remove(loc1)
for value in combs[loc2][locs[loc2]][1]:
used[value].remove(loc2)
if loc1 < len(combinations)-1:
cost1 = cost
locs1 = list(locs)
used1 = copy.deepcopy(used)
cost1 -= combs[loc1][locs[loc1]][0]
locs1[loc1] += 1
cost1 += combs[loc1][locs[loc1]][0]
for value in combs[loc1][locs1[loc1]][1]:
used1[value].append(loc1)
for value in combs[loc2][locs1[loc2]][1]:
used1[value].append(loc2)
if tuple(locs1) not in tested:
tested.add(tuple(locs1))
heapq.heappush(minheap, (cost1, priority(used1.values()), locs1, used1))
if loc2 < len(combinations)-1:
cost2 = cost
locs2 = list(locs)
used2 = copy.deepcopy(used)
cost2 -= combs[loc2][locs2[loc2]][0]
locs2[loc2] += 1
cost2 += combs[loc2][locs2[loc2]][0]
for value in combs[loc1][locs2[loc1]][1]:
used2[value].append(loc1)
for value in combs[loc2][locs2[loc2]][1]:
used2[value].append(loc2)
if tuple(locs2) not in tested:
tested.add(tuple(locs2))
heapq.heappush(minheap, (cost2, priority(used2.values()), locs2, used2))
break
else:
print(f'found a solution with {cost} cost:')
print(locs)
for i , target in enumerate(array3):
print(f'{target}\t~=\t ', end='')
print(*array2[combs[i][locs[i]][1]], sep='+', end=' ')
print('\t(', end='')
print(*list1[combs[i][locs[i]][1]], sep=', ', end='')
print(')')
exit()
It will return (one of) the triplets combinations that minimize the cost and only uses each number in array2
at most twice.
Because you didn't specified the criteria for the best solution when there isn't a exact one, I assumed the absolute difference between the sum of a triplet and its target, but you can change that in dist
.
It works incredibly fast with your example (<10s), but I have guarantees it will be as fast as that, and you'll probably need some randomization. But this is one solution for your example:
80 ~= 28+23+29 (ch, eh, dg)
74 ~= 29+39+6 (dg, di, ai)
84 ~= 13+33+38 (ij, gj, hj)
89 ~= 37+39+13 (bc, di, ij)
89 ~= 30+40+19 (gk, jk, fh)
78 ~= 7+40+31 (ah, jk, ei)
79 ~= 31+18+30 (ei, fi, gk)
85 ~= 13+37+35 (ce, fg, dk)
81 ~= 18+32+31 (bf, cg, df)
89 ~= 34+20+35 (eg, be, dk)
75 ~= 13+28+34 (bd, bi, ag)
86 ~= 18+39+29 (bf, ab, dh)
76 ~= 29+38+9 (ad, hj, dj)
71 ~= 14+37+20 (bh, bc, be)
82 ~= 29+20+33 (dh, bk, gj)
79 ~= 14+37+28 (ef, hk, ch)
75 ~= 28+9+38 (bi, ci, ae)
78 ~= 34+38+6 (eg, cf, gi)
83 ~= 29+31+23 (ad, df, eh)
89 ~= 37+38+14 (hk, cf, ef)
Upvotes: 1
Reputation: 223
This assumes each element (with distinct index) in array2 is used only once (you could extend to element repeats), and that you don't care which three elements you use:
# target is the desired number from array3
def triplet_sum(list1, array2, target):
n = len(array2)
a = [(i, array2[i]) for i in range(n)]
a.sort(key=lambda x: x[1])
j = 1
i = j-1
k = j+1
best_sum = sys.maxsize
best_answer = None
while j < n:
while i >= 0 and k < n:
x = a[i][1]
y = a[j][1]
z = a[k][1]
S = x + y + z
candidate = [(x, list1[a[i][0]]), (y, list1[a[j][0]]), (z, list1[a[k][0]])]
if S == target:
return candidate
elif S > target:
i -= 1
else:
k += 1
if abs(target - best_sum) > abs(target - S):
best_sum = S
best_answer = candidate
j += 1
i = j-1
k = j+1
return best_answer
Example output:
# Closest match
triplet_sum(list1, array2, 5)
[(2, 'af'), (2, 'aj'), (2, 'bj')]
# An exact match
triplet_sum(list1, array2, 80)
[(20, 'be'), (20, 'bk'), (40, 'jk')]
I'm just moving my middle choice j
along the sorted list a
, always going left with i
if S
is too high, and to the right with k
if S
is too low. O(n^2) complexity at a glance.
Upvotes: 1