Maarten Truyens
Maarten Truyens

Reputation: 141

Merging unsorted collections

Assume I have two collections:

A = [John, Mary, Frank, Isabel, Teresa]

B = [Madison, John, Frank, Isabel, Bob]

The algorithm should produce the following result:

MERGE = [Madison, John, Mary, Frank, Isabel, Teresa, Bob]

(although it's also fine if Teresa and Bob are switched)

In other words, the algorithm should use the existing ordering of the two input collections to create a merged collection. There is a theoretically unlimited amount of possible elements, and there is no "master" list from which the order of the elements can be taken.

For my use case, the input collections will be reasonably small (typically less than 50 items), and the majority of items will be the same between the collections, although that's not guaranteed.

Is this a known type of algorithm? I have been searching for merging algorithms, but most talk about ordered lists and performance optimizations thereof.

----- EDIT ------

A few more examples:

first additional example: A = [John, Mary, Frank, Isabel, Teresa, Robert, Bob, Anna, Tessa, Philip] B = [John, Mary, Robert, Bob, Philip, Nicholas] MERGE = [John, Mary, Frank, Isabel, Teresa, Robert, Bob, Anna, Tessa, Philip, Nicholas]

(so the algorithm should infer that Nicholas should be positioned after Philip, because that's also the case in collection B)

second additional example: A = [John, Mary, Frank, Isabel, Teresa, Robert, Bob, Anna, Tessa, Philip] B = [Betty, John, Bob, Philip, Nicholas, Boris] MERGE = [Betty, John, Mary, Frank, Isabel, Teresa, Robert, Bob, Anna, Tessa, Philip, Nicholas, Boris]

(so the algorithm should infer that Betty should be positioned before John, and Nicholas & Boris after Philip)

Upvotes: 0

Views: 62

Answers (2)

Paul Hankin
Paul Hankin

Reputation: 58319

This is a variant of topological sort, with the relation x < y if x appears before y in either list.

This algorithm produces the merged list whenever that's possible:

  • If the heads of the two lists are the same, add that head to the result, and remove it from both lists.
  • If the head of the either list is not in the other list, add it to the result, and remove it from the list it's in.
  • Otherwise, there's no result that preserves the order of the elements in both lists.

You can make this efficient -- that is, O(n) time -- by keeping a set for each lists for the elements remaining in the list, or building a map for each list that maps elements to their index in the list.

Upvotes: 2

Eagleclaw
Eagleclaw

Reputation: 369

If i understand your problem correctly, you want to merge two arrays while protecting their order and remove the consecutive duplicates.

In order to do that, we can simply loop the bigger array and at every loop we can control the list length, consecutive duplicate etc...

Code:

A = ["John", "Mary", "Frank", "Isabel", "Teresa"]
B = ["Madison", "John", "Frank", "Isabel", "Bob"]

# Determine the bigger array length.
loop_count = len(A)
if len(B) > len(A):
    loop_count = len(B)

# Loop the arrays and append to merged array if not out of index,
# or not consecutive duplicate.
merged = []   
for i in range(0, loop_count):
    if i < len(A):
        if len(merged) == 0 or merged[-1] != A[i]:
            merged.append(A[i])
    if i < len(B):
        if len(merged) == 0 or merged[-1] != B[i]:
            merged.append(B[i])

print(merged)

Output: ['John', 'Madison', 'Mary', 'John', 'Frank', 'Isabel', 'Teresa', 'Bob']

Its not the perfect solution, there is better ways to optimize it but since you said your arrays are reasonably small, it should be fine enough.

Upvotes: 0

Related Questions