Reputation: 29566
Given a set of closed regions [a,b]
where a
and b
are integers I need to find another set of regions that cover the same numbers but are disjoint.
I suppose it is possible to do naively by iterating through the set several times, but I am looking for a recommendation of a good algorithm for this. Please help.
EDIT:
to clarify, the resulting regions cannot be larger than the original ones, I have to come up with disjoint regions that are contained by the original ones. In other words, I need to split the original regions on the boundaries where they overlap.
example:
3,8
1,4
7,9
11,14
result:
1,3
3,4
4,7
7,8
8,9
11,14
Upvotes: 1
Views: 299
Reputation: 52008
(This is a modification of an answer that I posted earlier today which I deleted after I discovered it had a logic error. I later realized that I could modify Vincent van der Weele's elegant idea of using parenthesis depth to fix the bug)
On Edit: Modified to be able to accept intervals of length 0
Call an interval [a,a] of length 0 essential if a doesn't also appear as an endpoint of any interval of length > 0. For example, in [1,3], [2,2], [3,3], [4,4] the 0-length intervals [2,2] and [4,4] are essential but [3,3] isn't. Inessential 0-length intervals are redundant thus need not appear in the final output. When the list of intervals is initially scanned (loading the basic data structures) points corresponding to 0-length intervals are recorded, as are endpoint of intervals of length > 0. When the scan is completed, two instances of each point corresponding to essential 0-length intervals are added into the list of endpoints, which is then sorted. The resulting data structure is a multiset where the only repetitions correspond to essential 0-length intervals.
For every endpoint in the interval define the pdelta (parentheses delta) of the endpoint as the number of times that point appears as a left endpoint minus the number of times it appears as a right endpoint. Store these in a dictionary keyed by the endpoints.
[a,b] where a,b are the first two elements of the list of endpoints is the first interval in the list of disjoint intervals. Define the parentheses depth of b to be the sum of pdelta[a] and pdelta[b]. We loop through the rest of the endpoints as follows:
In each pass through the loop look at the parenthesis depth of b. If it is not 0 than b is still needed for one more interval. Let a = b and let the new p be the next value in the list. Adjust the parentheses depth be the pdelta of the new b and add [a,b] to the list of disjoint intervals. Otherwise (if the parenthesis depth of b is 0) let the next [a,b] be the next two points in the list and adjust the parenthesis depth accordingly.
Here is a Python implementation:
def disjointify(intervals):
if len(intervals) == 0: return []
pdelta = {}
ends = set()
disjoints = []
onePoints = set() #onePoint intervals
for (a,b) in intervals:
if a == b:
onePoints.add(a)
if not a in pdelta: pdelta[a] = 0
else:
ends.add(a)
ends.add(b)
pdelta[a] = pdelta.setdefault(a,0) + 1
pdelta[b] = pdelta.setdefault(b,0) - 1
onePoints.difference_update(ends)
ends = list(ends)
for a in onePoints:
ends.extend([a,a])
ends.sort()
a = ends[0]
b = ends[1]
pdepth = pdelta[a] + pdelta[b]
i = 1
disjoints.append((a,b))
while i < len(ends) - 1:
if pdepth != 0:
a = b
b = ends[i+1]
pdepth += pdelta[b]
i += 1
else:
a = ends[i+1]
b = ends[i+2]
pdepth += (pdelta[a] + pdelta[b])
i += 2
disjoints.append((a,b))
return disjoints
Sample output which illustrates various edge cases:
>>> example = [(1,1), (1,4), (2,2), (4,4),(5,5), (6,8), (7,9), (10,10)]
>>> disjointify(example)
[(1, 2), (2, 2), (2, 4), (5, 5), (6, 7), (7, 8), (8, 9), (10, 10)]
>>> disjointify([(1,1), (2,2)])
[(1, 1), (2, 2)]
(I am using Python tuples to represent the closed intervals even though it has the minor drawback of looking like the standard mathematical notation for open intervals).
A final remark: referring to the result as a collection of disjoint interval might not be accurate since some of these intervals have nonempty albeit 1-point intersections
Upvotes: 1
Reputation: 13177
Just sort all endpoints left to right (remember their type: start or end). Swype left to right. Keep a counter starting at 0. Whenever you come across a start: increment the counter. When you come across an end: decrement (note that the counter is always at least 0).
Keep track of the last two points. If the counter is greater than zero - and the last two points are different (prevent empty ranges) - add the interval between the last two points.
Pseudocode:
points = all interval endpoints
sort(points)
previous = points[0]
counter = 1
for(int i = 1; i < #points; i++) {
current = points[i]
if (current was start point)
counter++
else
counter--
if (counter > 0 and previous != current)
add (previous, current) to output
previous = current
}
Upvotes: 2