Reputation: 41
I have two very long lists and am trying to map every integer in first list to closet integer in other list within a maximum threshold difference. I can do it looping over entire lists but is there a faster way to do it? Output can be like x (of list A ) = y (of list B)
. If a number is equidistant from two integers, any one can be chosen. I tried using -
from itertools import product
sorted(product(arr1, arr2), key=lambda t: abs(t[0]-t[1]))
which is very slow and also not one to one mapping.
Example:
Input
A = [1, 3, 8, 77, 66, 25, 40]
B= [2, 5, 12, 120, 30, 21, 40]
Threshold = 40
Output
1=2 ; 3=5 ; 8=12 ; 77- dropped out as nothing best under threshold; 66=30 (as 40 is closer but has better match in other list ); 25=21; 40=40
Upvotes: 1
Views: 145
Reputation: 82889
You can do this in O(n log n) using binary search. In Python, you can use the bisect
module for this.
import bisect
A = [1, 3, 8, 77, 66, 25, 40]
B = [2, 5, 12, 120, 30, 21, 40]
threshold = 40
B = sorted(B) # has to be sorted for binary search
def closest(a, B):
i = bisect.bisect(B, a)
b = min(B[max(0, i-1):i+1], key=lambda b: abs(a - b))
if abs(a - b) < threshold:
return b
C = {a: closest(a, B) for a in A}
C = {a: b for a, b in C.items() if b is not None}
print(C) # {1: 2, 66: 40, 3: 2, 40: 40, 8: 5, 25: 21, 77: 40}
Now, C
has the closest element form B
for each element from A
, but there might be duplicates (e.g. 40
). To resolve those, we have to do a second pass, sorting the elements from A
by their "closeness" to any element from B
using C
. Then, we can remove
the elements from B
once we assigned them to some element from A
.
D = {}
for a in sorted(C, key = lambda a: abs(C[a] - a)):
b = closest(a, B)
if b:
D[a] = b
B.remove(b)
print(D) # {1: 2, 66: 30, 3: 5, 8: 12, 40: 40, 25: 21}
Upvotes: 1
Reputation: 291
Let's use bisect
library, reference at: https://docs.python.org/3.6/library/bisect.html.
bisect.bisect_left(a, x, lo=0, hi=len(a))
Locate the insertion point for x in a to maintain sorted order.
Example:
from bisect import bisect_left, bisect_right
A = [1, 3, 8, 4, 77, 66, 25, 40]
B = [2, 5, 12, 120, 30, 21, 40]
Threshold = 40
B.sort()
for k in A:
left = bisect_left(B, k)
right = bisect_right(B, k)
s = B[left] if abs(B[left] - k) < abs(B[right] - k) else B[right]
if abs(k - s) <= Threshold:
print(k, s)
Upvotes: 2
Reputation:
If you want you can try:
print(" ".join(["{}={}".format(i[0],i[1]) for i in list(zip(A,B)) if abs(i[0] - i[1]) <= 30]))
output:
1=2 3=5 8=12 25=21 40=40
Upvotes: 0