Reputation: 13079
I have a number-line between 0 to 1000. I have many line segments on the number line. All line segments' x1 is >= 0 and all x2 are < 1000. All x1 and x2 are integers.
I need to find all of the unions of the line segments.
In this image, the line segments are in blue and the unions are in red:
Is there an existing algorithm for this type of problem?
Upvotes: 3
Views: 2031
Reputation: 21
(Although the question was asked a while ago, the answer will hopefully help all future seekers.)
Marzullo's Algorithm (Wikipedia link) is for a useful intersection of 1D intervals, while Klee's Algorithm (Wikipedia link, currently in draft) computes their union as a disjointed set. There is a verbose JS implementation there, but here's a more compact version:
function union(intervals = []) {
// the input is expected to be an array of intervals,
// with each interval being a 2-element array
const endpoints = [];
for (let interval of intervals) {
endpoints.push({v: interval[0], nesting: +1});
endpoints.push({v: interval[1], nesting: -1});
}
endpoints.sort((a, b) => a.v - b.v + (a.v !== b.v ? 0 : a.nesting === -1 ? 1 : 0));
const disjointIntervals = [];
let nestingLevel = 0;
let curInterval;
for (let endpoint of endpoints) {
if (nestingLevel === 0) disjointIntervals.push(curInterval = [endpoint.v]);
nestingLevel += endpoint.nesting;
if (nestingLevel === 0) curInterval.push(endpoint.v);
}
return disjointIntervals;
}
Upvotes: 2
Reputation: 46
You can use marzullo's algorithm (see Wikipedia for more details). Here is a Python implementation I wrote:
def ip_ranges_grouping(range_lst):
## Based on Marzullo's algorithm
## Input: list of IP ranges
## Returns a new merged list of IP ranges
table = []
for rng in range_lst:
start,end = rng.split('-')
table.append((ip2int(start),1))
table.append((ip2int(end),-1))
table.sort(key=lambda x: x[0])
for i in range(len(table) - 1):
if((table[i][0] == table[i+1][0]) and ((table[i][1] == -1) and (table[i+1][1] == 1))):
table[i],table[i+1] = table[i+1],table[i]
merged = []
end = count = 0
while (end < len(table)):
start = end
count += table[end][1]
while(count > 0): # upon last index, count == 0 and loop terminates
end += 1
count += table[end][1]
merged.append(int2ip(table[start][0]) + '-' + int2ip(table[end][0]))
end += 1
return merged
Upvotes: 2
Reputation: 428
If the segments are not changed dynamically, it is a simple problem. Just sorting all the segments by the left end, then scanning the sorted elements:
struct Seg {int L,R;};
int cmp(Seg a, Seg b) {return a.L < b.L;}
int union_segs(int n, Seg *segs, Segs *output) {
sort(segs, segs + n, cmp);
int right_most = -1;
int cnt = 0;
for (int i = 0 ; i < n ; i++) {
if (segs[i].L > right_most) {
right_most = segs[i].R;
++cnt;
output[cnt].L = segs[i].L;
output[cnt].R = segs[i].R;
}
if (segs[i].R > right_most) {
right_most = segs[i].R;
output[cnt].R = segs[i].R;
}
}
return cnt+1;
}
The time complexity is O(nlogn) (sorting) + O(n) (scan).
If the segments are inserted and deleted dynamically, and you want to query the union at any time, you will need some more complicated data structures such as range tree.
Upvotes: 1
Reputation: 776
Considering that the coordinates of your segments are bounded ([0, 1000]) integers, you could use an array of size 1000 initialized with zeroes. You then run through your set of segments and set 1 on every cell of the array that the segment covers. You then only have to run through the array to check for contigous sequences of 1.
--- -----
--- ---
1111100111111100
The complexity depends on the number of segments but also on their length.
Here is another method, which also work for floating point segments. Sort the segments. You then only have to travel the sorted segments and compare the boundaries of each adjacent segments. If they cross, they are in the same union.
Upvotes: 1