Joey Ngo

Reputation: 111

Given a lat/long, find the nearest location based on a json list of lat/long

Given a json file,

{"BusStopCode": "00481", "RoadName": "Woodlands Rd", "Description": "BT PANJANG TEMP BUS PK", "Latitude": 1.383764, "Longitude": 103.7583},
{"BusStopCode": "01012", "RoadName": "Victoria St", "Description": "Hotel Grand Pacific", "Latitude": 1.29684825487647, "Longitude": 103.85253591654006}

and so on,

of various bus stops, I am trying to find the nearest bus stops in this list of 5000 for any user-given lat/long, using the following formula:

from math import cos, sqrt
R = 6371000 # radius of the Earth in m
# lat1, lon1, lat2, lon2 must be in radians for this formula
x = (lon2 - lon1) * cos(0.5*(lat2+lat1))
y = (lat2 - lat1)
d = R * sqrt( x*x + y*y )

My question is: for a user-supplied lat1 and lon1, how can I compute the distance between (lat1, lon1) and every (lat2, lon2) among the 5000 lat/lon pairs in the JSON file, and then print the 5 smallest distances?

I have thought of using list.sort, but I am not sure how to compute all 5000 distances in Python.

Thank you so much.

Edit:

With the code from Eric Duminil, the following code works for my needs.

from math import cos, sqrt, radians
import json
with open("stops.json") as f:
    busstops = json.load(f)
R = 6371000 # radius of the Earth in m
def distance(lon1, lat1, lon2, lat2):
  # the formula expects radians, but the JSON stores degrees
  lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
  x = (lon2-lon1) * cos(0.5*(lat2+lat1))
  y = (lat2-lat1)
  return R * sqrt( x*x + y*y )
buslist = sorted(busstops, key= lambda d: distance(d["Longitude"], d["Latitude"], 103.5, 1.2))
print(buslist[:5])

where 103.5, 1.2 in the buslist line is an example of a user-supplied longitude and latitude.

Upvotes: 4

Views: 7353

Answers (2)

Eric Duminil

Reputation: 54293

You could simply define a function to calculate the distance and use it to sort bus stops with the key argument:

from math import cos, sqrt, pi, radians

R = 6371000 # radius of the Earth in m
def distance(lon1, lat1, lon2, lat2):
    # inputs are in degrees; math.cos expects radians
    x = (lon2 - lon1) * cos(radians(0.5*(lat2+lat1)))
    y = (lat2 - lat1)
    # 2*pi*R/360 is the length in metres of one degree of arc
    return (2*pi*R/360) * sqrt( x*x + y*y )

busstops = [{"BusStopCode": "00481", "RoadName": "Woodlands Rd", "Description": "BT PANJANG TEMP BUS PK", "Latitude": 1.383764, "Longitude": 103.7583},
{"BusStopCode": "01012", "RoadName": "Victoria St", "Description": "Hotel Grand Pacific", "Latitude": 1.29684825487647, "Longitude": 103.85253591654006}]

print(sorted(busstops, key= lambda d: distance(d["Longitude"], d["Latitude"], 103.5, 1.2)))
# [{'BusStopCode': '00481', 'RoadName': 'Woodlands Rd', 'Description': 'BT PANJANG TEMP BUS PK', 'Latitude': 1.383764, 'Longitude': 103.7583}, {'BusStopCode': '01012', 'RoadName': 'Victoria St', 'Description': 'Hotel Grand Pacific', 'Latitude': 1.29684825487647, 'Longitude': 103.85253591654006}]

Once this list is sorted, you can simply extract the 5 closest bus stops with [:5]. It should be fast enough, even with 5000 bus stops.
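If only the few closest stops are needed, a full sort is not strictly required. Here is a minimal sketch using heapq.nsmallest from the standard library. The helper name nearest_stops and its parameters are made up for illustration, and it assumes each stop is a dict with "Latitude" and "Longitude" in degrees, as in the question:

```python
import heapq
from math import cos, pi, radians, sqrt

R = 6371000  # mean radius of the Earth in m


def distance(lon1, lat1, lon2, lat2):
    """Equirectangular approximation; inputs in degrees, result in metres."""
    x = (lon2 - lon1) * cos(radians(0.5 * (lat2 + lat1)))
    y = lat2 - lat1
    return (2 * pi * R / 360) * sqrt(x * x + y * y)


def nearest_stops(stops, lon, lat, n=5):
    """Hypothetical helper: return the n closest stops as
    (distance_in_metres, stop) pairs, nearest first."""
    with_distance = (
        (distance(s["Longitude"], s["Latitude"], lon, lat), s) for s in stops
    )
    # nsmallest keeps only n candidates at a time instead of sorting everything
    return heapq.nsmallest(n, with_distance, key=lambda pair: pair[0])
```

Calling nearest_stops(busstops, 103.5, 1.2) would return at most 5 pairs, so the distances can be printed alongside the stops. For 5000 stops the difference from sorted(...)[:5] is probably negligible; the main benefit is that each distance is kept with its stop.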

Note that if you don't care about the specific distance but only want to sort bus stops, you could use this function as key:

from math import cos, radians

def distance2(lon1, lat1, lon2, lat2):
    # squared distance in degrees; math.cos expects radians
    x = (lon2 - lon1) * cos(radians(0.5*(lat2+lat1)))
    y = (lat2 - lat1)
    return x*x + y*y
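The reason the squared value works as a sort key is that the square root is monotonically increasing, so it never changes which of two stops is closer. A small sketch to check this, with made-up names (squared_distance, same_ranking) and points given as (longitude, latitude) tuples in degrees:

```python
from math import cos, radians, sqrt


def squared_distance(lon1, lat1, lon2, lat2):
    """Squared equirectangular distance, in squared degrees."""
    x = (lon2 - lon1) * cos(radians(0.5 * (lat2 + lat1)))
    y = lat2 - lat1
    return x * x + y * y


def same_ranking(points, lon, lat):
    """Hypothetical check: do the squared and rooted keys sort alike?"""
    by_squared = sorted(
        points, key=lambda p: squared_distance(p[0], p[1], lon, lat)
    )
    by_root = sorted(
        points, key=lambda p: sqrt(squared_distance(p[0], p[1], lon, lat))
    )
    return by_squared == by_root
```

For points at clearly different distances, same_ranking should return True; skipping the square root only saves a little work per stop.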

Upvotes: 3

bky

Reputation: 1375

I did the same thing for a similar project, but calculating all the distances for a large dataset can take a lot of time.

I ended up using k-nearest neighbors (NearestNeighbors from scikit-learn), which is much faster: the model is fitted once and can be reused for every query, so you don't need to recalculate all the distances each time:

import numpy as np
from sklearn.neighbors import NearestNeighbors

buslist = [{ ...., 'latitude':45.5, 'longitude':7.6}, { ...., 'latitude':48.532, 'longitude':7.451}]

buslist_coords = np.array([[x['latitude'], x['longitude']] for x in buslist]) # extracting the lat/lon coordinates

num_connections = 5 # number of neighbours returned per query

# training the knn with the coordinates
# note: the default metric is Euclidean distance on the raw degree values
knn = NearestNeighbors(n_neighbors=num_connections)
knn.fit(buslist_coords)
distances, indices = knn.kneighbors(buslist_coords) # neighbours of every bus stop (assumed intent of the original)
# you can pickle these and load them later to determine the nearest point to a user


# finding the nearest point for a given coordinate
userlocation = [47.456, 6.25]
userlocation = np.array([[userlocation[0], userlocation[1]]])
distances, indices = knn.kneighbors(userlocation)

# get the 5 nearest stations in a list (a plain list cannot be indexed with a NumPy array)
nearest_stations = [buslist[i] for i in indices[0][:5]] # the order of the buslist must be the same when training the knn and finding the nearest point

# printing the 5 nearest stations
for station in nearest_stations:
    print(station)

After that, I built a graph with networkx from this data, but I'm still using knn.kneighbors(userlocation) to find the point nearest to a user.

Upvotes: 1
