Reputation: 206
I have a large file of points and I am trying to find the distance between those points and another set of points. Initially I used the to_crs
function of geopandas to convert the crs so that I can get an accurate distance measure in terms of meters when I do df.distance(point)
. However, since the file is very large, it took too long just to convert the crs of the file. The code was running for 2 hours and it still did not finish converting. Therefore, I used this code instead.
inProj = Proj(init='epsg:4326')
outProj = Proj(init='epsg:4808')
for index, row in demand_indo_gdf.iterrows():
o = Point(row['origin_longitude'], row['origin_latitude'])
o_proj = Point(transform(inProj, outProj, o.x, o.y))
for i, r in bus_indo_gdf.iterrows():
stop = r['geometry']
stop_proj = Point(transform(inProj, outProj, stop.x, stop.y))
print ('distance:', o_proj.distance(stop_proj), '\n\n')
I thought it may be faster to individually convert the crs and carry out my analysis. For this set of points:
o = (106.901024 -6.229162)
stop = (106.804 -6.21861)
I converted this EPSG 4326 coordinates to the local projection, EPSG 4808, and got this:
o_proj = (0.09183386384156803 -6.229330112968891)
stop_proj = (-0.005201753272169649 -6.218776788266844)
This gave a distance measure of 0.09760780527657992. Google maps gave me a distance measure, for coordinates o
and stop
, of 10.79km. It looks like the distance measure of my code gives an answer that is 10^-3 times smaller than the actual distance. Why is that so? Is my code correct?
Upvotes: 0
Views: 1454
Reputation: 7121
The coordinates you are using are in degrees, not in meters.
The Points
class that you are using probably does not care about this and calculates the cartesian distance between them.
Please use the Haversine equation, e.g. the one from here:
from math import radians, cos, sin, asin, sqrt
def haversine(lon1, lat1, lon2, lat2):
"""
Calculate the great circle distance between two points
on the earth (specified in decimal degrees)
"""
# convert decimal degrees to radians
lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
# haversine formula
dlon = lon2 - lon1
dlat = lat2 - lat1
a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
c = 2 * asin(sqrt(a))
r = 6371 # Radius of earth in kilometers. Use 3956 for miles
return c * r
Then your code returns the correct result:
from pyproj import Proj, transform
inProj = Proj(init='epsg:4326')
outProj = Proj(init='epsg:4808')
o = (106.901024, -6.229162)
stop = (106.804, -6.21861)
o_proj = transform(inProj, outProj, o[0], o[1])
stop_proj = transform(inProj, outProj, stop[0], stop[1])
print(haversine(o_proj[0], o_proj[1], stop_proj[0], stop_proj[1]))
Upvotes: 1