Pairing weighted items from two ordered lists while maintaining the order

Question

I am searching for an algorithm that solves this problem:

I have two lists with items for example:

A = {a, b, c, d}
B = {e, f, g}

and a table telling me how good each item from one list matches an item from the other list. For example:

	e	f	g
a	10	2	1
b	1	0	2
c	2	0	0
d	1	20	0

Now I need to pair items of A with items from B so that the the sum of matches is maximised, while keeping both lists in order and only using each item once. Not all items need to be matched.

For this example the solution would be:

[{a, e}, {d, f}]

Anatolii · Accepted Answer

Introduction

You can use dynamic programming for your problem, and an element dp[i][j] denotes the maximum possible sum reached till the current position i,j. i corresponds to an i-th element from one list and j - to an j-th element from the other array.

Flow

I will Python for drafting an algorithm although the same algorithm will perform much better in other languages like JAVA or C++. With Python, however, it's a bit more readable (imho).

Preparation

For brevity, I assume that your table always has at least 2 rows and columns.

A size of the table of matches and the table of matches can be represented as:

n, m = 4, 3 # n stands for number of rows, and m - columns
matches = [[10, 2, 1],
  [1, 0, 2],
  [2, 0, 0],
  [1, 20, 0]]

Create a dp array initially containing the same elements as matches:

dp = matches

Find maximum sum

Now, to calculate the maximum sum for dp[i][j] we have to consider 4 cases:

dp[i][j] can be a maximum value.
dp[i][j - 1] is a maximum value.
dp[i - 1][j] is the largest.
dp[i - 1][j - 1] + dp[i][j] is the maximum

So, it can be done as below:

for j in range(m):
    for i in range(n):
        top = -1
        if i > 0:
            top = dp[i - 1][j]
        left = -1
        if j > 0:
            left = dp[i][j - 1]
        topLeft = -1
        if i > 0 and j > 0:
            topLeft = dp[i - 1][j - 1]

        dp[i][j] = max(topLeft + dp[i][j], top, left, dp[i][j])

Good, if you print out dp[n - 1][m - 1], you will see the maximum sum as per your task.

Print sequence with maximum sum

Basically, you have to do the reverse procedure to print out the sequence. You should start from dp[n - 1][m - 1] because it denotes the maximum sum.

So, go left and check if an element to the left is different from the maximum value. If it's different then the current element belongs to the sequence only if it's not equal to an element above (that means either dp[i-1][j-1] + dp[i][j] or dp[i][j] is the largest).

Another option is that the leftmost element is also a maximum one. Then, go up to find the origin maximum element and stop the procedure.

So, repeat the main procedure for the upper row until the sequence is built.

solution = []
i = n - 1
j = m - 1
done = False
while i >= 0 and not done:
    current_max = dp[i][j]
    while j > 0:
        if dp[i][j - 1] != current_max:
            break
        j -= 1

    if j <= 0:
        while i > 0 and dp[i][0] == dp[i - 1][0]:
            i -= 1
        solution.append((i, j))
        done = True
        break

    if i == 0 or dp[i][j] != dp[i - 1][j]:
        solution.append((i, j))
        j -= 1

    i -= 1

Full Code

# n - rows
# m - columns
n, m = 4, 3
matches = [[10, 2, 1],
      [1, 0, 2],
      [2, 0, 0],
      [1, 20, 0]]

A = ['a', 'b', 'c', 'd']
B = ['e', 'f', 'g']

dp = matches

for j in range(m):
    for i in range(n):
        top = -1
        if i > 0:
            top = dp[i - 1][j]
        left = -1
        if j > 0:
            left = dp[i][j - 1]
        topLeft = -1
        if i > 0 and j > 0:
            topLeft = dp[i - 1][j - 1]

        dp[i][j] = max(topLeft + dp[i][j], top, left, dp[i][j])

print (dp[n - 1][m - 1])

solution = []
i = n - 1
j = m - 1
done = False
while i >= 0 and not done:
    current_max = dp[i][j]
    while j > 0:
        if dp[i][j - 1] != current_max:
            break
        j -= 1

    if j <= 0:
        while i > 0 and dp[i][0] == dp[i - 1][0]:
            i -= 1
        solution.append((i, j))
        done = True
        break

    if i == 0 or dp[i][j] != dp[i - 1][j]:
        solution.append((i, j))
        j -= 1

    i -= 1

while len(solution) > 0:
    i, j = solution.pop()
    print ('(%s,%s)' % (A[i], B[j]))