Andrei
Andrei

Reputation: 1373

How to find the nearest neighbors for latitude and longitude point on python?

Input:

point = (lat, long)
places = [(lat1, long1), (lat2, long2), ..., (latN, longN)]
count = L

Output: neighbors = subset of places close to the point. (len(neighbors)=L)

Question: Can I use kd-tree for quick nearest-neighbors lookup for points with latitude and longitude? (For example, implementation in scipy)

Is it necessary to transform the geographical coordinates (latitude and longitude) of the point in the coordinates x,y?

Is it the best way to solve this?

Upvotes: 14

Views: 16243

Answers (3)

Florian Brucker
Florian Brucker

Reputation: 10355

scikit-learn provides a BallTree class that supports the Haversine metric. See also this SO question.

Upvotes: 6

Marcel Wilson
Marcel Wilson

Reputation: 4572

I honestly don't know if using a kd-tree would work correctly, but my hunch says it would be inaccurate.

I think you need to use something like greater circle distance to get accurate distances.


from math import radians, cos, sin, asin, sqrt, degrees, atan2

def validate_point(p):
    lat, lon = p
    assert -90 <= lat <= 90, "bad latitude"
    assert -180 <= lon <= 180, "bad longitude"

# original formula from  http://www.movable-type.co.uk/scripts/latlong.html
def distance_haversine(p1, p2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    Haversine
    formula: 
        a = sin²(Δφ/2) + cos φ1 ⋅ cos φ2 ⋅ sin²(Δλ/2)
                        _   ____
        c = 2 ⋅ atan2( √a, √(1−a) )
        d = R ⋅ c

    where   φ is latitude, λ is longitude, R is earth’s radius (mean radius = 6,371km);
            note that angles need to be in radians to pass to trig functions!
    """
    lat1, lon1 = p1
    lat2, lon2 = p2
    for p in [p1, p2]:
        validate_point(p)

    R = 6371 # km - earths's radius

    # convert decimal degrees to radians 
    lat1, lon1, lat2, lon2 = map(radians, [lat1, lon1, lat2, lon2])

    # haversine formula 
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a)) # 2 * atan2(sqrt(a), sqrt(1-a))
    d = R * c
    return d

Upvotes: 6

gsamaras
gsamaras

Reputation: 73366

I think you are trying to solve the k Nearest Neighbor problem.

Since your data set lies in 2D, then a kd-tree would work nicely, in general, I do not know about spicy.

However, if your points start living in a higher dimension, then kd-tree will not be a smart choice.

Upvotes: 1

Related Questions