SalmaFG
SalmaFG

Reputation: 2192

Fast way to find the closest polygon to a point

I have a list of Shapely polygons and a point like so:

from shapely.geometry import Point, Polygon

polygons = [Polygon(...), Polygon(...), ...]
point = Point(2.5, 5.7)

and I want to find the closest polygon in the list to that point. I'm already aware of the object.distance(other) function which returns the minimum distance between two geometric shapes, and I thought about computing all the distances in a loop to find the closest polygon:

polygons = [Polygon(...), Polygon(...), ...]
point = Point(2.5, 5.7)
min_dist = 10000
closest_polygon = None

for polygon in polygons:
    dist = polygon.distance(point)
    if dist < min_dist:
        min_dist = dist
        closest_polygon = polygon

My question is: Is there a more efficient way to do it?

Upvotes: 9

Views: 7522

Answers (3)

Drey
Drey

Reputation: 3364

There is a shorter way, e.g.

from shapely.geometry import Point, Polygon
import random
from operator import itemgetter


def random_coords(n):
    return [(random.randint(0, 100), random.randint(0, 100)) for _ in range(n)]


polys = [Polygon(random_coords(3)) for _ in range(4)]
point = Point(random_coords(1))

min_distance, min_poly = min(((poly.distance(point), poly) for poly in polys), key=itemgetter(0))

as Georgy mentioned (++awesome!) even more concise:

min_poly = min(polys, key=point.distance)

but distance computation is in general computationally intensive

Upvotes: 2

ma3oun
ma3oun

Reputation: 3790

I have a solution that works if you have at least 2 polygons with a distance different from 0. Let's call these 2 polygons "basePolygon0" and "basePolygon1". The idea is to build a KD tree with the distance of each polygon to each of the "basis" polygons. Once the KD tree has been built, we can query it by computing the distance to each of the basis polygons.

Here's a working example:

from shapely.geometry import Point, Polygon
import numpy as np
from scipy.spatial import KDTree 

# prepare a test with triangles
poly0 = Polygon([(3,-1),(5,-1),(4,2)])
poly1 = Polygon([(-2,1),(-4,2),(-3,4)])
poly2 = Polygon([(-3,-3),(-4,-6),(-2,-6)])
poly3 = Polygon([(-1,-4),(1,-4),(0,-1)])

polys = [poly0,poly1,poly2,poly3]

p0 = Point(4,-3)
p1 = Point(-4,1)
p2 = Point(-4,-2)
p3 = Point(0,-2.5)

testPoints = [p0,p1,p2,p3]

# select basis polygons
# it works with any pair of polygons that have non zero distance
basePolygon0 = polys[0]
basePolygon1 = polys[1]

# compute tree query
def buildQuery(point):
    distToBasePolygon0 = basePolygon0.distance(point)
    distToBasePolygon1 = basePolygon1.distance(point)
    return np.array([distToBasePolygon0,distToBasePolygon1])


distances = np.array([buildQuery(poly) for poly in polys])

# build the KD tree
tree = KDTree(distances)

# test it
for p in testPoints:
    q = buildQuery(p)
    output = tree.query(q)
    print(output)

This yields as expected:

# (distance, polygon_index_in_KD_tree)
(2.0248456731316584, 0)
(1.904237866994273, 1)
(1.5991500555008626, 2)
(1.5109986459170694, 3)

Upvotes: 1

Calvin Godfrey
Calvin Godfrey

Reputation: 2359

There is one way that might be faster, but without doing any actual tests, it's hard for me to say for sure.

This might not work for your situation, but the basic idea is each time a Shapely object is added to the array, you adjust the position of different array elements so that it is always "sorted" in this manner. In Python, this can be done with the heapq module. The only issue with that module is that it's hard to choose a function to compare to different objects, so you would have to do something like this answer, where you make a custom Class for objects to put in the heapq that is a tuple.

import heapq

class MyHeap(object):
   def __init__(self, initial=None, key=lambda x:x):
       self.key = key
       if initial:
           self._data = [(key(item), item) for item in initial]
           heapq.heapify(self._data)
       else:
           self._data = []

   def push(self, item):
       heapq.heappush(self._data, (self.key(item), item))

   def pop(self):
       return heapq.heappop(self._data)[1]

The first element in the tuple is a "key", which in this case would be the distance to the point, and then the second element would be the actual Shapely object, and you could use it like so:

point = Point(2.5, 5.7)
heap = MyHeap(initial=None, key=lambda x:x.distance(point))

heap.push(Polygon(...))
heap.push(Polygon(...))
# etc...

And at the end, the object you're looking for will be at heap.pop().

Ultimately, though, both algorithms seem to be (roughly) O(n), so any speed up would not be a significant one.

Upvotes: 0

Related Questions