Reputation: 625
Say we have two lists of the same length, ls1
and ls2
. For example, we have
ls1 = [4, 6]
ls2 = [3, 5]
and for each element in ls1
we have to pair it with one element with one element in ls2
, in a manner such that the total (absolute) difference between the element in ls1
and the element in ls2
is minimal. One element can only be matched once. In the example above, the optimal way is to match 4
is ls1
with 3
in ls2
, and 5
in ls1
with 6
in ls2
, which yields a total difference of
(4 - 3) + (6 - 5) = 2
I need a program to return this minimum total difference between the elements in these two lists. The length of the lists is arbitrary, and so is the values of the elements in the lists (but they are all positive integers).
I currently know that using permutation to brute force the solution is an option, but what I need is a code that has optimal time and space complexity. I heard of the concept of dynamic programming, but I have no idea of how to implement it in my case. Thanks in advance for replies.
Ps. Here's my current brute force code using permutation, which is not efficient in runtime or memory usage:
from itertools import permutations
def minimum_total_difference(ls1, ls2):
length = len(ls1)
possibilities = list(permuations(ls1, length))
out = 10**10
for possibility in possibilities:
out_ = 0
for _ in range(length):
diff = abs(possibility[_] - ls2[_])
out += diff
if out_ < out:
out = out_
return out
Upvotes: 4
Views: 3914
Reputation: 625
As @kraskevich has pointed out, the correct answer is indeed:
sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys))
I have come up with my own proof:
Consider two lists xs
and ys
consisting of elements, in random order, x1
, x2
, ... xn
and y1
, y2
, ... yn
.
Since we're trying to find out the minimal sum of absolute difference, we could use square root instead of absolute values, which has little impact to find the minimum value.
Therefore the sum of the differences is:
(x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2
=x1^2 - 2x1 * y1 + y1^2 + ... + xn^2 - 2xn * yn + yn^2
As we can see, in whatever manner we arrange the two lists, the quadratic terms xn^2 and yn^2 stay the same. So, to obtain a minimal result, we simply have to maximize the negative terms -2xn * yn.
In order to do so, we simply multiply the largest value in one list by the largest value in the other list, and then do the same for the second largest value in the two lists, etc. (see Given 2 arrays, find the minimal sum of multiplication of indexes). Hence, by sorting both lists and multiplying elements of the same index in the sorted lists, we obtain the minimal sum of differences.
Upvotes: 1
Reputation: 18556
One can prove that the optimal solution is to sort both lists and match their elements in sorted order.
A sketch of the proof:
Let there be an inversion, that is a
is matched to b
, c
is matched to d
and a < c, b > d
.
We can "exchange" these elements: a->d, c->b
. Now a < c, d < b
. One can show that this operation never increases the answer (by considering all possible relative values of a, b, c, d
)
Thus, there is always an optimal matching where both lists are sorted.
Here's an efficient one-liner that implements this solution:
sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys)))
Upvotes: 8