Veneet Reddy
Veneet Reddy

Reputation: 2917

Efficient algorithm to find rectangles from lines?

Given a list of paths(with start and end coordinates) that are only horizontal and vertical, how can I find all the rectangles that they form?

Details:

This is my naive implementation which is too slow. Are there any other approaches?

def find_rectangles(paths):
    vertical_paths = filter(lambda path: is_vertical(path), paths)
    horizontal_paths = filter(lambda path: is_horizontal(path), paths)

    vertical_paths.sort(key=lambda path: path.start.imag, reverse=True)
    horizontal_paths.sort(key=lambda path: path.start.real)

    h_pairs = []
    for i in range(0, len(horizontal_paths) - 1):
        for j in range(1, len(horizontal_paths)):
            if horizontal_paths[i].start.real == horizontal_paths[j].start.real and horizontal_paths[i].end.real == horizontal_paths[j].end.real:
                h_pairs.append((horizontal_paths[i], horizontal_paths[j]))

    v_pairs = []
    for i in range(0, len(vertical_paths) - 1):
        for j in range(1, len(vertical_paths)):
            if vertical_paths[i].start.imag == vertical_paths[j].start.imag and vertical_paths[i].end.imag == vertical_paths[j].end.imag:
                v_pairs.append((vertical_paths[i], vertical_paths[j]))


    rects = []
    for h1, h2 in h_pairs:
        for v1, v2 in v_pairs:
            if h1.start == v1.start and v1.end == h2.start and h1.end == v2.start and h2.end == v2.end:
                rects.append(Rect(h1.start, h1.end, h2.end, h2.start))

    return rects

Edit:(Improvements from all suggestions)

Main difference is I am storing all horizontal edges' end-points in a set so look up for them is O(1):

def find_rectangles(paths):
    vertical_paths = [path for path in paths if is_vertical(path)]
    horizontal_paths_set = set([(path.start, path.end) for path in paths if is_horizontal(path)])

    vertical_paths.sort(key=lambda pair: path.start.imag, reverse=True)

    v_pairs = [pair for pair in list(itertools.combinations(vertical_paths, 2)) if pair[0].start.imag == pair[1].start.imag and pair[0].end.imag == pair[1].end.imag]

    rects = []
    for v1,v2 in v_pairs:
        h1 = (v1.start, v2.start)
        h2 = (v1.end, v2.end)

        if(h1 in horizontal_paths_set and h2 in horizontal_paths_set):
            rects.append(Rect(h1[0], h1[1], h2[1], h2[0 ]))

    return rects

My new code runs much much faster, but it still is on the order of O(n2). Any suggestion to improve upon it is welcome.

Upvotes: 1

Views: 1489

Answers (1)

Rogier Polak
Rogier Polak

Reputation: 126

You can drop the search for v_pairs to begin with. You only need to know if the potential rectangle (a horizontal pair) can be closed.

def find_rectangles(paths):
    vertical_paths = filter(lambda path: is_vertical(path), paths)
    horizontal_paths = filter(lambda path: is_horizontal(path), paths)

    vertical_paths.sort(key=lambda path: path.start.imag, reverse=True)
    horizontal_paths.sort(key=lambda path: path.start.real)

    potential_rectangles = []
    for i,h1 in enumerate(horizontal_paths[:-1]):
        for h2 in horizontal_paths[i+1:]:
            if ((h1.start.real == h2.start.real) 
                    and (h1.end.real == h2.end.real)):
                potential_rectangles.append((h1,h2,None,None))

    rectangles = []
    for v in vertical_paths:
        for i,(h1,h2,v1,v2) in enumerate(potential_rectangles):
            if v1 is None and v.start == h1.start and v.end == h2.start:
                potential_rectangles[i][2] = v
                if v2 is not None:
                    rectangles.append(potential_rectangles.pop(i))
                break
            if v2 is None and v.start == h1.end and v.end == h2.end:
                potential_rectangles[i][3] = v
                if v1 is not None:
                    rectangles.append(potential_rectangles.pop(i))
                break

    return rectangles

There certainly is a lot of potential to speed up the selection, dependent on the data received. For instance, sorting paths by length. Can you give more details?

After Edit Bisect is a lot faster than 'is in', but it requires an ordered list of scalar values.

Upvotes: 1

Related Questions