ANRIOS2020

Reputation: 45

Calculate distance in large list of coordinates

I'd like some help with a task. I'm a Python beginner and I'm trying to calculate the distance between sequential items, for example item1 to item2, then item2 to item3, and so on.

There's only one problem: in my dataframe I must partition these calculations by the field ZCGUNLEIT, as it indicates a route. Each ZCGUNLEIT will have ~300 coordinates, and I must know the distances between these 300 coordinates and then move on to the next ZCGUNLEIT.

I tried the haversine library but couldn't work out how to integrate it with my dataframe.

If anyone can shed some light here, it will be appreciated.

Note: This dataframe has millions of rows.

list of items with lat and long

Upvotes: 1

Views: 2077

Answers (1)

Ran A

Reputation: 774

From an answer to this question: Getting distance between two points based on latitude/longitude

The Haversine formula assumes the earth is a sphere, which results in errors of up to about 0.5% (according to help(geopy.distance)). The Vincenty distance uses more accurate ellipsoidal models such as WGS-84, and was implemented in geopy as geopy.distance.vincenty. That function was removed in geopy 2.0; on current versions use geopy.distance.geodesic, which is also based on an ellipsoidal model. For example,

import geopy.distance

coords_1 = (52.2296756, 21.0122287)
coords_2 = (52.406374, 16.9251681)

print(geopy.distance.geodesic(coords_1, coords_2).km)

will print the distance of 279.352901604 kilometers using the default ellipsoid WGS-84. (You can also choose .miles or one of several other distance units).
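For comparison, the spherical Haversine formula mentioned above can be sketched with the standard library alone. This is only an illustration: the function name haversine_km and the mean earth radius of 6371.0088 km are assumptions, not part of geopy, and the result will differ slightly from the ellipsoidal distance.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(p1, p2, radius_km=6371.0088):
    """Great-circle distance in km between two (lat, lon) points in degrees.

    radius_km is an assumed mean earth radius (spherical approximation).
    """
    lat1, lon1 = map(radians, p1)
    lat2, lon2 = map(radians, p2)
    # Haversine of the central angle between the two points
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius_km * asin(sqrt(a))


coords_1 = (52.2296756, 21.0122287)
coords_2 = (52.406374, 16.9251681)

print(haversine_km(coords_1, coords_2))
```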

So for your question, if your data is a pandas DataFrame, here is an example:

import geopy.distance
import pandas as pd

df = pd.DataFrame(
    data=[[53.2296756, 21.0122287], [52.406374, 16.9241681],
          [52.2296756, 21.0112287], [55.406374, 16.9231681]],
    columns=['LATITUDE', 'LONGITUDE'],
)

# The first row has no previous point, so its distance is 0
dist = [0]
for i in range(1, len(df)):
    prev_point = (df.LATITUDE.iloc[i - 1], df.LONGITUDE.iloc[i - 1])
    this_point = (df.LATITUDE.iloc[i], df.LONGITUDE.iloc[i])
    # geodesic replaces vincenty, which was removed in geopy 2.0
    dist.append(geopy.distance.geodesic(prev_point, this_point).km)

df['distance'] = dist
df

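The loop above treats the whole dataframe as a single route and calls geopy once per row, which may be slow for millions of rows. One possible alternative, sketched here under assumptions, is to compute a vectorized Haversine distance with NumPy and use groupby(...).shift() so that distances never cross from one ZCGUNLEIT to the next. The sample data, the column names LATITUDE and LONGITUDE, the helper name haversine_np, and the earth radius are all made up for illustration; the sketch also assumes the rows are already sorted in travel order within each route, and it accepts the spherical error of up to about 0.5% mentioned earlier.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0088  # assumed mean earth radius (spherical model)


def haversine_np(lat1, lon1, lat2, lon2):
    """Vectorized great-circle distance in km; inputs are in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Hypothetical sample: two routes, rows already in travel order
df = pd.DataFrame({
    'ZCGUNLEIT': ['A', 'A', 'A', 'B', 'B'],
    'LATITUDE': [52.2296756, 52.406374, 52.2296756, 53.2296756, 52.2296756],
    'LONGITUDE': [21.0122287, 16.9251681, 21.0122287, 21.0122287, 21.0122287],
})

# Previous point within the same route; NaN on the first row of each route
prev = df.groupby('ZCGUNLEIT')[['LATITUDE', 'LONGITUDE']].shift()

# Distance from the previous point; 0 for the first point of each route
df['distance_km'] = haversine_np(
    prev['LATITUDE'], prev['LONGITUDE'], df['LATITUDE'], df['LONGITUDE']
).fillna(0)

print(df)
```

If ellipsoidal accuracy is required, the same shift-based pairing could feed geopy.distance.geodesic row by row instead, at the cost of speed.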

Upvotes: 2
