Reputation: 683
Suppose I have 2 unsorted arrays of the same size, for example:
A = {1, 4, 3, 2}
B = {5, 12, 1, 5}
I would like to find the minimal sum of multiplying of every 2 cells - one from each array, meaning - the sum of A[i] * B[j]
(i
is an index in A
, j
is an index in B
). which values should I multiply with which values from the other array in order to get that minimal sum of products?
(I hope it's clear that once you performed A[i]*B[j]
you can't touch those cells again...)
edit: for the example above, the minimal sum is: 1*4 + 3*5 + 5*2 + 1*12 = 41
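For a small instance like this, the claimed minimum can be checked by brute force over all pairings (a minimal Python sketch; the variable names are mine):

```python
from itertools import permutations

A = [1, 4, 3, 2]
B = [5, 12, 1, 5]

# Every pairing of A's cells with B's cells corresponds to a permutation of B.
best = min(sum(a * b for a, b in zip(A, p)) for p in permutations(B))
print(best)  # 41
```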
Upvotes: 6
Views: 4657
Reputation: 11195
This theorem is known as the Rearrangement inequality.
Proof sketch:
Here is a simpler proof by contradiction than the one of @j_random_hacker.
Let's assume that a is sorted in increasing order.
Since each cell is used at most once, a choice of pairing corresponds to a permutation s, and the total sum is f(s) = sum(a[i]*b[s(i)], i=1..n).
As the number of permutations is finite, there is at least one minimum. Let s1 be one of these minima, so f(s1) <= f(s) for all s.
We can transform it into a permutation s2 with the same sum, f(s1) = f(s2) <= f(s), such that if a[i] = a[i+1] then b[s2(i)] >= b[s2(i+1)] (1),
by simply sorting in decreasing order each sub-array b[s1(k)] .. b[s1(k+m)] of b where a[k] = a[k+1] = .. = a[k+m].
So f(s2) is minimal too. Let's show by contradiction that b[s2(k)] is decreasing.
Assume there exists i such that b[s2(i)] < b[s2(i+1)] (2).
Because of (1), a[i] != a[i+1], so a[i] < a[i+1] (3).
From (2), b[s2(i)] - b[s2(i+1)] < 0 (4).
From (3), a[i] - a[i+1] < 0 (5).
In this case:
a[i]*b[s2(i)] + a[i+1]*b[s2(i+1)] - (a[i]*b[s2(i+1)] + a[i+1]*b[s2(i)])
= a[i]*(b[s2(i)] - b[s2(i+1)]) + a[i+1]*(b[s2(i+1)] - b[s2(i)])
= (a[i] - a[i+1])*(b[s2(i)] - b[s2(i+1)]) > 0 (the product of two negative numbers, by (4) and (5))
so a[i]*b[s2(i)] + a[i+1]*b[s2(i+1)] > a[i]*b[s2(i+1)] + a[i+1]*b[s2(i)] (6)
Now define the permutation s3 that swaps i and i+1:
s3(n) = i+1 if n = i
s3(n) = i if n = i+1
s3(n) = s2(n) otherwise
By (6), a[i]*b[s2(i)] + a[i+1]*b[s2(i+1)] > a[i]*b[s3(i)] + a[i+1]*b[s3(i+1)],
and a[n]*b[s2(n)] = a[n]*b[s3(n)] for n != i and n != i+1,
so f(s2) > f(s3).
So s3 is a better permutation than s2, which contradicts the minimality of s2. Hypothesis (2) is therefore false.
So for all i, b[s2(i)] >= b[s2(i+1)], i.e. b[s2(k)] is in decreasing order. Choosing a decreasing order for b therefore gives the optimal result.
By a similar proof one can show that the maximum sum of products is obtained by sorting a and b both in increasing order.
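The inequality translates directly into code: sort one array ascending and the other descending for the minimum, or both the same way for the maximum (a sketch; the function names are mine):

```python
def min_sum_of_products(a, b):
    # Rearrangement inequality: ascending against descending minimizes the sum.
    return sum(x * y for x, y in zip(sorted(a), sorted(b, reverse=True)))

def max_sum_of_products(a, b):
    # Sorting both arrays the same way maximizes the sum instead.
    return sum(x * y for x, y in zip(sorted(a), sorted(b)))

print(min_sum_of_products([1, 4, 3, 2], [5, 12, 1, 5]))  # 41
print(max_sum_of_products([1, 4, 3, 2], [5, 12, 1, 5]))  # 74
```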
Upvotes: 1
Reputation: 51226
Aravol gives some intuition as to why the reverse-sorted matching is optimal, but with this kind of thing I think it's useful to really rigorously prove that it always works.
One way to prove this is to show that, given any matching that isn't already reverse-sorted, changing that matching slightly to make it "more like" a reverse-sorted matching never increases the total. If we can show that this process always terminates, ending at the reverse-sorted matching,
then we have shown that the reverse-sorted matching is no worse than any other matching -- which is the same as showing that it has the lowest possible total. I want to emphasise that none of the following computations are actually performed by any algorithm; for the purposes of the proof, all we have to show is that they could be performed, and that they would have the desired behaviour if so.
To that end, let's suppose we are given an arbitrary matching in the form of a set of pairs. As a first step, let's sort them in increasing order by their first element. (We can do this because the order of listing the pairs doesn't affect their sum.) Call the ith pair after sorting (x[i], y[i]); by definition, after sorting we have x[i] <= x[i+1] for all i. Now if the matching is not already reverse-sorted, there must be a pair of adjacent y values that violate the reverse-sortedness -- i.e. there must be some i such that y[i] < y[i+1]. Let's pick the first (lowest) value of i for which this occurs, and see how swapping y[i] and y[i+1] will affect the total. To do that, we can just subtract out the contribution of the two old pairs, and add in the contribution of the two new pairs: x[i]y[i+1] + x[i+1]y[i] - x[i]y[i] - x[i+1]y[i+1] = x[i](y[i+1] - y[i]) + x[i+1](y[i] - y[i+1]). Letting d = y[i+1] - y[i], this simplifies to x[i]d - x[i+1]d = d(x[i] - x[i+1]). We know x[i] - x[i+1] <= 0, and because of how we chose i we also know d > 0, so their product must be <= 0. In other words, performing this swap never increases the total.
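The effect of one such swap can be checked on a tiny numeric example (the values are mine, chosen so that x[i] <= x[i+1] and y[i] < y[i+1]):

```python
x = [2, 7]   # x[i] <= x[i+1]
y = [3, 9]   # y[i] < y[i+1]: a reverse-sortedness violation
before = x[0] * y[0] + x[1] * y[1]        # total before the swap
after = x[0] * y[1] + x[1] * y[0]         # total after swapping y[0] and y[1]
d = y[1] - y[0]                           # d > 0 by choice of i
print(after - before, d * (x[0] - x[1]))  # both -30: the swap lowered the total
```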
But does it get us any "closer" to the reverse-sorted matching, and thus to terminating? Yes, because repeating this find-violation-then-swap procedure behaves just like insertion-sorting the first i+1 elements. Let's call the original value of y[i+1] z. After swapping y[i] and y[i+1], the z value has just moved back one place in the list of pairs. If we rerun the process of looking for the first reverse-sorting violation, we might now find that it occurs at i-1 (i.e. we find that y[i-1] < y[i]) -- in that case, we can swap y[i-1] with y[i], moving the z value back another place. We can keep doing this until either z arrives at y[1], or we find that the first position j such that y[j] < y[j+1] also has the property that y[j-1] >= y[j+1]: the latter means that the z value has arrived in its final position, because we know that y[k] >= y[j-1] for all k < j-1. In any case, z will arrive in its final position after at most i swaps, and the next violation-finding run will find a position later than the original i -- or determine that there is no such position (i.e. that the y values are now reverse-sorted).
All in all, after at most n^2 find-violation-then-swap operations, the y values will be in reverse sorted order, and the total will be at most the original total. Since we made no assumptions about the initial matching we were given, this result applies to any given matching, so we have proven that the reverse-sorted matching is at least as good as all of them. (Notice that no aspect of this proof depends on the numbers being nonnegative, so this will apply to arrays containing negative numbers too.)
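The whole find-violation-then-swap procedure can be sketched as follows. This helper is purely illustrative of the proof and is not code from the answer; as noted, no algorithm actually needs to run it, since the optimum can be computed by sorting directly:

```python
def reverse_sort_by_swaps(pairs):
    """Illustrative only: repeatedly fix the first reverse-sortedness
    violation among the y values; each swap changes the total by
    d * (x[i] - x[i+1]) <= 0, so the total never increases."""
    pairs = sorted(pairs)                  # sort pairs ascending by x
    x = [p[0] for p in pairs]
    y = [p[1] for p in pairs]
    while True:
        for i in range(len(y) - 1):
            if y[i] < y[i + 1]:            # a violation: swap the two y values
                y[i], y[i + 1] = y[i + 1], y[i]
                break
        else:                              # no violation left: y is reverse-sorted
            return list(zip(x, y))

final = reverse_sort_by_swaps(zip([1, 4, 3, 2], [5, 12, 1, 5]))
print(sum(x * y for x, y in final))  # 41
```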
Upvotes: 3
Reputation: 10708
In order to find the smallest summation, order one set ascending, the other descending, and then multiply along the like indices.
This is because the smallest possible product for each pairing a[i] * b[j] (for a fixed a[i], since every element must produce a result) is obtained with the smallest available value of b[j], and vice versa.
This will also work with negatives, because the greatest negative is the smallest number, and thus pairs with the most positive number from the other array. This continues to hold even when both sets are entirely negative, because each product then equals the one obtained when both sets are negated.
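A quick sketch with all-negative inputs (my example values):

```python
A = [-3, -1, -2]
B = [-4, -6, -5]

# Order one set ascending, the other descending, multiply along like indices.
total = sum(a * b for a, b in zip(sorted(A), sorted(B, reverse=True)))
print(total)  # 28
```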
Upvotes: 6