pearson boy
pearson boy

Reputation: 69

Are there any efficient algorithm for "top hull"?

for example, when points are given in form of (x-left, x-right, y) in x,y coordinate, ( 1 , 5 , 3), (2 , 4, 5) returns (1,2,3) , (2 ,4 ,5), (4,5,3)

Upvotes: 0

Views: 382

Answers (3)

Matt Timmermans
Matt Timmermans

Reputation: 59174

The simplest efficient ( O(N log N) ) way to solve this problem is with a "line sweep" algorithm.

Imagine sweeping a vertical line from left to right across the whole set, keeping track of the top-most segment that it crosses. Whenever the top-most segment changes, it is an important event that could affect the top-hull. It would be easy to calculate the top-hull from a list of these changes.

Notice that these important events can only happen at the x positions where one of the input segments starts or ends. Instead of smoothly sweeping the line, therefore, we only need to consider what happens at these x positions. So:

  1. Generate a list of all the start and end events for the segments. For a segment { 1, 2, 5}, for example would generate events {1, start 5}, {2, end 5}
  2. Create a max heap to keep track of the top-most segment.
  3. Sort all the events by x position, and process them in order. For events with the same x position, do the start events first. This ensures that you process each segment's start event before its end event.
  4. For each x position in the list, process all the events with that x position as a unit. For each start event {x, start y} add y to the heap. For each event {x, end y}, remove y from the heap.
  5. When you're done processing the events for an x position, the biggest element in the heap is the y position of the top-hull until the next x position in the list.

Upvotes: 2

Dillon Davis
Dillon Davis

Reputation: 7740

Prunes answer has the right idea, but I feel it doesn't do justice to explaining how to check interval overlap. In fact, that portion of the algorithm runs in quadratic time O(n^2) since it forms all n^2 pairs at some point, which is unnecessary. What I would do-

Queuing

First, create a max-heap from your list of segments, with the y-coordinate as its key. You can extract and remove the max in O(logn) time, so this plays out to be the same time complexity as sorting, just with popping builtin.

heap = max_heap(segement_list)
output = []
while heap is not empty
    segment = heap.pop() # max / highest

    # trim / split segment
    # append trimmed segment(s)

Intersections

Now we just need to trim the segment. Instead of pairing it against every other segment and trimming them as necessary, we'll use another data structure to allow us to quickly query for potential intersections. We'll store each added segment in a binary search tree, with the lower x-coordinate as its key. We can then traverse this tree looking for the largest segment (by lower x-coordinate) less than or equal to the segment we are about to add.

For the sake of making the following paragraphs less bloated with technical details, lets just get the implementation details of two key comparisons out of the way. Assume segment a has a smaller lower_x value than b (because in the following paragraph, we'll always know which is smaller).

# boolean- do `a` and `b` intersect
function intersects(a, b)
    return a.upper_x >= b.lower_x

# boolean- is `b` a subsegment of `a`
function is_subsegment(a, b)
    return a.upper_x >= b.upper_x

We'll also need three transformations, using the same a and b definition-

function merge(a, b)
    a.upper_x = b.upper_x

function trim_left(a, b)
    a.upper_x = b.lower_x

function trim_right(a, b)
    b.lower_x = a.upper_x

Returning to the idea of querying a BST- take left_segment obtained when querying for our segment segment, and see if they intersect. If they intersect, check if segment is_subsegment of left_segment. If it is, abort and continue on to the next segment in the heap. Otherwise, provided they intersect, we'll need to trim_right segment. Regardless of intersection or not, we'll then handle any right side intersections. Afterwards, we can merge the modified segment (actually subsegment, as you'll see) with left_segment if they overlapped.

The left_segment is the only segment that can overlap from the left, because we merge all overlapping segments as they are inserted into the BST. However, this does not hold for the right side, since our segment has yet to be trimmed from the right. We'll need to handle trimming incrementally.

Take right_segment to be the next segment after left_segment in the tree (in-order traversal). Make a copy of segment called subsegment. If subsegment intersects right_segment, trim_left it, insert subsegment into the output array, merge the two segments, and remove right_segment from the BST. Otherwise, just insert subsegment into the array. Now we may merge subsegment with left_segment, if they overlapped. If they did not, insert subsegment into the BST, and assign the variable left_segment to it.

Now we repeat this process, until we break out from segment being a is_subsegment of left_segment. Afterwards, we repeat the entire process with the next segment from the heap.

Analysis

We know that forming our max-heap and popping the max n times will result in O(nlogn) time complexity. The tricky part is figuring out the time complexity for processing the intersections. Observe that for each segment we handle, after all subsegments have been handled and merged, we will have increased the size of our BST overall by at most one. This is because all of our subsegments get merged together each iteration, so they end up forming one big segment. This means our BST is no larger than n, so querying into and removing / inserting into the BST takes O(logn) time.

Also note that for each segment inserted into the BST, it will only every intersect another segment once- because when it does, the two (or more) will merge into a single new segment. The exception to this is when a segment is a subsegment of its left_segment, but in that case we abort without emitting any new segments anyways, so its +0 net change in size anyways. Using this knowledge, combined with the prior observation that each segment ultimately contributes at most one new segment to the BST, we can conclude that there will be at most O(n) intersections, and thus insertions / deletions. So O(nlogn) time to maintain our BST.

Given the rest of our operations are constant time, our overall time complexity is O(nlogn), instead of O(n^2) when brute forcing for intersections and trimming.

Upvotes: 2

Prune
Prune

Reputation: 77837

A straightforward greedy algorithm solves this neatly. Sort your segments by y-coordinate, descending; ca;; this list seg. Now ...

top_hull = [empty set]
while seg is not empty
    head = seg.pop()    // pop off the highest segment.
    top_hull += head
    for each segment in seg
        remove the interval (head.xleft, head.y-left) from segment

Note that interval removal can some in a few cases:

`head`'s interval does not overlap `segment`
    no action
`head`'s interval contains `segment`
    remove segment
`segments`'s interval contains `head` (both ends stick out
    split segment into the two remaining pieces.
`head`'s interval overlaps one end of `segment`
    truncate segment

Depending on your implementation language, there may be excellent support packages for interval algebra.

Upvotes: 4

Related Questions