AT6996
AT6996

Reputation: 39

Efficient algorithm to match two arrays with same values but different order

We know two arrays A, B of size n have same set of distinct integer values, but maybe ordered in different ways, so there is one-to-one match between the two arrays. We don't know the actual values stored in the arrays, and no way to access values of the array thus can't use function such as sorted. However we may ask question about any pair of A[i] and B[j], and get answer of whether A[i] = B[j], A[i] > B[j] or A[i] < B[j].

What's an efficient algorithm to match each element A to its corresponding element in B? First thing came to my mind is a simple brute force approach, first match A[1] with its counterpart in B by repeated asking relationship question between A[1] and B[j] for j = 1 to n until we find the match. Then do the same for A[i] for i = 2 to n to find match for rest of the A[i]s. The worst case running time would be n + (n - 1) + ... + 1 = O(n^2). But seems this would not be the most efficient way to do it, because we only used the information of whether A[i] = B[j], but not the more detailed information of whether A[i] > B[j] or A[i] < B[j]. Intuitively there should be a way to utilize the relative order between A[i] and B[j] to design an algorithm in less than O(n^2) time.

Any help would be greatly appreciated!

Upvotes: 1

Views: 2061

Answers (4)

superb rain
superb rain

Reputation: 5521

Here's a Python implementation of Matt's algorithm.

  • AB creates and hides A and B, only offering to compare A[i] and B[j] and checking whether index lists I and J are a perfect match.
  • quickmatch is the solution algorithm

Demo outupt showing the computed I, J and whether they're a perfect match:

[16, 0, 19, 7, 10, 15, 12, 14, 5, 4, 11, 18, 1, 9, 2, 6, 13, 8, 3, 17]
[1, 3, 13, 12, 0, 18, 9, 4, 14, 17, 11, 6, 5, 7, 15, 2, 8, 10, 16, 19]
True

Code:

from random import shuffle

class AB:
    def __init__(self, n):
        A = list(range(n))
        B = list(range(n))
        shuffle(A)
        shuffle(B)
        def cmp(i, j):
            a = A[i]
            b = B[j]
            return -1 if a < b else 1 if a > b else 0
        def check(I, J):
            if not (sorted(I) == sorted(J) == list(range(n))):
                return False
            return all(A[i] == B[j] for i, j in zip(I, J))
        self.cmp = cmp
        self.check = check

def quickmatch(I, J):
    if not I:
        return I, J

    ipivot = I[0]
    jpivot = next(j for j in J if ab.cmp(ipivot, j) == 0)

    small_I = [i for i in I if ab.cmp(i, jpivot) < 0]
    small_J = [j for j in J if ab.cmp(ipivot, j) > 0]
    small_I, small_J = quickmatch(small_I, small_J)

    large_I = [i for i in I if ab.cmp(i, jpivot) > 0]
    large_J = [j for j in J if ab.cmp(ipivot, j) < 0]
    large_I, large_J = quickmatch(large_I, large_J)

    return (small_I + [ipivot] + large_I,
            small_J + [jpivot] + large_J)

# Create a test case
n = 20
ab = AB(n)

# Solve
I = list(range(n))
J = list(range(n))
I, J = quickmatch(I, J)

# Check
print(I)
print(J)
print(ab.check(I, J))

Upvotes: 1

Matt Timmermans
Matt Timmermans

Reputation: 59194

Since you can't compare elements in the same array, you can't sort them individually, but you can do a sort of mutual quicksort, still in expected O(N log N) time:

  1. Pick an element of A as the A-pivot
  2. Partition the elements of B according to how they compare to the A-pivot. One of them will compare equal. That is the B-pivot
  3. Partition the elements of A according to how they compare the the B-pivot. The partitions will be the same sizes as B's partitions if there's a 1-1 matching.
  4. Recurse on the lower A and B partitions, and the upper A and B partitions.

Upvotes: 4

hivert
hivert

Reputation: 10667

The OP question is not so clear. First he says that the entries are integers but then he says that he can only compare them 2 by 2, which suggest that he cannot hash them. Here is a solution if hashing is not possible:

I would sort both array keeping the position of the elements along the elements. You can then check if the sorted values are equal and find the matching thanks to the positions. It is O(n log(n))

val1 = ["1", "5", "3", "6"]
val2 = ["3", "6", "1", "5"]

//  Key = ... means to compare i and j, compare val1[i] with val2[j]
val1s = sorted(range(len(val1)), key = lambda i: val1[i])
// The result is
val1s
[0, 2, 1, 3]
// which is effectively the order in which val1 must be read to have it sorted...

val2s = sorted(range(len(val2)), key = lambda i: val2[i])

perm = [None]*len(val1)
for i in range(len(val1)):
    perm[val1s[i]] = val2s[i]
    
perm

[2, 3, 0, 1]

Note : the code works verbatim with anything with is comparable (eg: strings).

Edit : I'm answering to the comment "you can't sort the array, since you don't know the array elements". This is wrong in some sense. Instead of sorting the array (I'm not allowed to change it), I return a list which give the correct order to read the array to be sorted. It is perfectly possible to get that without accessing to the element. See the python code.

Upvotes: 1

notnotparas
notnotparas

Reputation: 177

Here's my attempt on this problem:

from collections import defaultdict
a = [1,2,4,5,8]
b = [8,5,4,1,2]
d = defaultdict(list)   #(element,index) map

for i,v in enumerate(a):
    d[v]=[i]
for j,x in enumerate(b):
    d[x].append(j)
print(d)

Pretty straightforward.I'm creating a element to index map for one array, and using it match the keys of second array.
Time Complexity: O(n) Space Complexity: O(n)
Trying to find a O(1) space complexity solution, which intuitively could be done using divide and conquer approach, making use of that specified query we can ask for each element. Will update soon.

Upvotes: 1

Related Questions