Mathematics Lover

Reputation: 195

Fast algorithm for selecting two intervals from this set

Suppose we are given a set of closed intervals, each of the form [l,r]. We want to select two intervals from this set such that the size of their intersection times the size of their union is maximized. Is there a nontrivial algorithm that solves this problem?

For example, given the four intervals [1,6], [4,8], [2,7], [3,5], the optimal solution is to select [1,6] and [2,7], and the answer is (7-1) * (6-2) = 24.
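As a baseline for checking any faster approach, here is a minimal brute-force sketch that tries every pair. The function names `pair_weight` and `best_pair_bruteforce` are made up for illustration, and intervals are assumed to be `(l, r)` tuples with `l <= r`.

```python
from itertools import combinations

def pair_weight(p, q):
    """Union length times intersection length of two closed intervals."""
    (l1, r1), (l2, r2) = p, q
    inter = min(r1, r2) - max(l1, l2)
    if inter < 0:
        # Disjoint intervals: the intersection is empty, so the product is 0.
        return 0
    union = max(r1, r2) - min(l1, l2)
    return union * inter

def best_pair_bruteforce(intervals):
    """Return (weight, p, q) for the best pair, checking all O(n^2) pairs."""
    candidates = ((pair_weight(p, q), p, q) for p, q in combinations(intervals, 2))
    return max(candidates, key=lambda t: t[0])

print(best_pair_bruteforce([(1, 6), (4, 8), (2, 7), (3, 5)]))
# (24, (1, 6), (2, 7))
```

This is quadratic, so it is only useful as a reference on small inputs.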

Actually, the original problem asks us to select N >= 2 intervals, but I think we can prove that an optimal solution needs only two intervals:

If the optimum solution has three or more intervals:

[                     ]
            [               ]
                   [                          ]

We can see that the weight won't decrease if we delete the middle interval.

Upvotes: 7

Views: 601

Answers (2)

Lottery Discountz

Reputation: 26

Proof that two intervals suffice: there is no point in choosing an interval that is properly contained in another interval. Without loss of generality, then, let the intervals be [a1, b1], ..., [an, bn] such that a1 < ... < an. If no interval properly contains another interval, then b1 < ... < bn. For i < j < k, it holds that ([ai, bi] intersect [aj, bj] intersect [ak, bk]) = ([ai, bi] intersect [ak, bk]) and the same for union, so there is no reason to choose more than two intervals.

O(n log n)-time algorithm: reformulated, the problem is to find intervals [a, b] and [c, d] maximizing (d - a) * (b - c), since this product is negative iff the intervals do not intersect. Our algorithm is to do O(n log n) preprocessing that allows us to find the best mate for each interval in O(log n) time.

Let's work on finding the best mate for [a, b]. Do some algebra: (d - a) * (b - c) = d*b - d*c - a*b + a*c. Since a, b are fixed, we can drop the -a*b term and maximize the inner product <(a, b, 1), (c, d, -c*d)> = a*c + b*d - c*d over all intervals [c, d]. Since the set of vectors (c, d, -c*d) is fixed, this is essentially simulating the collision of a stationary polyhedron and a moving plane normal to (a, b, 1). Thanks to Edelsbrunner and Maurer (Finding extreme points in three dimensions and solving the post-office problem in the plane, 1984), there's an algorithm that preprocesses in time O(n log n) and answers queries of this type for different a and b in O(log n) time.
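The algebra can be sanity-checked with a short sketch. It does not implement the Edelsbrunner and Maurer structure; the hypothetical `best_mate_linear` is a plain linear scan standing in for the O(log n) extreme-point query. The lifted point is written as (c, d, -c*d), the ordering under which the dot product with (a, b, 1) reproduces the terms d*b - d*c + a*c of the expansion.

```python
def direct_score(a, b, c, d):
    """The product (d - a) * (b - c) from the reformulation."""
    return (d - a) * (b - c)

def dot_score(a, b, c, d):
    """Same value via <(a, b, 1), (c, d, -c*d)> plus the constant -a*b."""
    query = (a, b, 1)
    point = (c, d, -c * d)
    return sum(x * y for x, y in zip(query, point)) - a * b

def best_mate_linear(a, b, others):
    """Linear-scan stand-in for the extreme-point query (O(n) per query)."""
    return max(others, key=lambda cd: a * cd[0] + b * cd[1] - cd[0] * cd[1])

others = [(4, 8), (2, 7), (3, 5)]
for c, d in others:
    assert direct_score(1, 6, c, d) == dot_score(1, 6, c, d)
print(best_mate_linear(1, 6, others))  # (2, 7)
```

Note that for a nested pair such as [1,6] and [3,5] the product (d - a) * (b - c) is 12 while the true union-times-intersection is 10, so the formula matches the true weight only for pairs where neither interval properly contains the other.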

One sour detail is that we must choose at least two intervals, but the best "solution" may be just the longest interval paired with itself. I'm confident that it's messy but possible to extend Edelsbrunner-Maurer to find the second most extreme point in the same running time.

Upvotes: 0

mcdowella

Reputation: 19601

Given a set of N > 2 overlapping intervals which supposedly maximises union times intersection, set aside an interval containing the leftmost point in the union and an interval containing the rightmost point in the union. Since N > 2 you have at least one other interval left. If you remove this interval from the set, you do not decrease the size of the union of intervals, because you set aside intervals to cover the leftmost and rightmost points. You can only increase the size of the intersection by removing an interval. So by removing this interval you can only increase the product you are trying to maximise, so the best solution can indeed be found at N = 2.
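The reduction argument can be illustrated with a small sketch. The helper names `group_weight` and `reduce_to_two` are hypothetical, and the input is assumed to hold at least two `(l, r)` tuples. The sketch sets aside one interval reaching the leftmost point and one (different) interval reaching furthest right among the rest, then checks that the pair's weight is no smaller than the group's.

```python
def group_weight(intervals):
    """Union size times common-intersection size for a group of intervals.

    If the common intersection is empty the product is 0; otherwise all
    intervals share a point, so the union is the single span [lo, hi].
    """
    lo = min(l for l, _ in intervals)
    hi = max(r for _, r in intervals)
    inter = min(r for _, r in intervals) - max(l for l, _ in intervals)
    if inter < 0:
        return 0
    return (hi - lo) * inter

def reduce_to_two(intervals):
    """Keep one interval covering the leftmost point and one reaching furthest right."""
    n = len(intervals)
    i = min(range(n), key=lambda k: intervals[k][0])
    j = max((k for k in range(n) if k != i), key=lambda k: intervals[k][1])
    return intervals[i], intervals[j]

group = [(1, 6), (2, 7), (3, 5), (4, 8)]
pair = reduce_to_two(group)
print(group_weight(group), pair, group_weight(list(pair)))
# 7 ((1, 6), (4, 8)) 14
```

The pair returned here is only guaranteed to be at least as good as the whole group; it is not necessarily the best pair overall.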

Sort the set of interval endpoints and go through it in increasing order. In case of ties, consider left endpoints before right endpoints. Keep track of a set of active intervals, adding an interval to the set when you see its left endpoint and removing it from the set when you see its right endpoint.

For any two overlapping intervals, there will be a point when one of them is already present and you are just about to add the other one. So if, just before you add an interval to the set, you compare it with all other intervals already in the set, you can compare all pairs of overlapping intervals. You can therefore compute the product of union and intersection between the interval about to be added and all other intervals in the set and keep track of the largest one seen.
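The sweep described above might be sketched as follows. `best_pair_sweep` is a made-up name and intervals are assumed to be `(l, r)` tuples with `l <= r`. In the worst case, when many intervals overlap at once, this still compares a quadratic number of pairs.

```python
def best_pair_sweep(intervals):
    """Sweep over sorted endpoints, comparing each new interval with the active set.

    Returns (weight, earlier_interval, later_interval), or None if no two
    intervals overlap.
    """
    events = []
    for idx, (l, r) in enumerate(intervals):
        events.append((l, 0, idx))  # kind 0 = left endpoint; sorts before kind 1 on ties
        events.append((r, 1, idx))  # kind 1 = right endpoint
    events.sort()

    active = set()  # indices of intervals currently open
    best = None
    for _, kind, idx in events:
        if kind == 0:
            l, r = intervals[idx]
            for j in active:
                lj, rj = intervals[j]
                # Every active interval overlaps the new one, so both factors are >= 0.
                w = (max(r, rj) - min(l, lj)) * (min(r, rj) - max(l, lj))
                if best is None or w > best[0]:
                    best = (w, intervals[j], intervals[idx])
            active.add(idx)
        else:
            active.discard(idx)
    return best

print(best_pair_sweep([(1, 6), (4, 8), (2, 7), (3, 5)]))
# (24, (1, 6), (2, 7))
```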

Upvotes: 2
