Janeson00

Reputation: 113

Getting distance from longitude and latitude using Haversine's distance formula

I am working with a pandas DataFrame and I am trying to compute the distance between consecutive points, from their longitude and latitude, within each identifier.

Here's the dataframe currently:

    Identifier       num_pts        latitude          longitude
0   AL011851            3              28.0              -94.8
1   AL011851            3              28.0              -95.4
2   AL011851            3              28.1              -96.0
3   AL021851            2              22.2              -97.6
4   AL021851            2              12.0              -60.0

I know I have to use the haversine distance formula, but I'm not sure how to apply it to my data.

import numpy as np
def haversine(lon1, lat1, lon2, lat2, earth_radius=6367):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)

    All args must be of equal length.

    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2

    c = 2 * np.arcsin(np.sqrt(a))
    km = earth_radius * c
    return km

This is the end result I expect, calculated by hand from the latitude and longitude alone:

    Identifier       num_pts        latitude          longitude            distance
0   AL011851            3              28.0              -94.8            NaN
1   AL011851            3              28.0              -95.4            58.870532
2   AL011851            3              28.1              -96.0            58.870532
3   AL021851            2              22.2              -97.6
4   AL021851            2              12.0              -60.0

EDIT: I need to calculate the distance between consecutive points (0 and 1, then 1 and 2, and so on), grouped by identifier so that the two points in a pair never come from different identifiers. When a new identifier such as AL021851 starts, the calculation should reset and only use the points in that identifier.

Upvotes: 1

Views: 903

Answers (1)

Andrew Lavers

Reputation: 4378

from io import StringIO
import pandas as pd

# Example data
df = pd.read_fwf(StringIO("""
Identifier       num_pts        latitude          longitude
AL011851            3              28.0              -94.8
AL011851            3              28.0              -95.4
AL011851            3              28.1              -96.0
AL021851            2              22.2              -97.6
AL021851            2              12.0              -60.0
"""), header=1)

# Provided function
import numpy as np
def haversine(lon1, lat1, lon2, lat2, earth_radius=6367):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)

    All args must be of equal length.

    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2

    c = 2 * np.arcsin(np.sqrt(a))
    km = earth_radius * c
    return km


# Use pandas shift to place prior values on each row, within a grouped dataframe
dfg = df.groupby("Identifier")
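df['p_latitude'] = dfg['latitude'].shift(1)
df['p_longitude'] = dfg['longitude'].shift(1)

# Assign to a new column - use pandas dataframe apply to invoke for each row
# (columns are read by label; positional x[0]-style access on a labelled row is deprecated in recent pandas)
df['distance'] = df[['p_latitude', 'p_longitude', 'latitude', 'longitude']].apply(
    lambda row: haversine(row['p_longitude'], row['p_latitude'], row['longitude'], row['latitude']),
    axis=1,
)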
print(df)

#  Identifier  num_pts  latitude  longitude  p_latitude  p_longitude     distance
#0   AL011851        3      28.0      -94.8         NaN          NaN          NaN
#1   AL011851        3      28.0      -95.4        28.0        -94.8    58.870532
#2   AL011851        3      28.1      -96.0        28.0        -95.4    59.883283
#3   AL021851        2      22.2      -97.6         NaN          NaN          NaN
#4   AL021851        2      12.0      -60.0        22.2        -97.6  4138.535287
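
Since haversine only uses NumPy functions that work element-wise, it should also accept whole columns, so the row-wise apply can probably be skipped. A minimal sketch, assuming the same column names as above; the DataFrame is rebuilt inline only so that the snippet is self-contained, and the name prev is just illustrative:

```python
import numpy as np
import pandas as pd


def haversine(lon1, lat1, lon2, lat2, earth_radius=6367):
    """Great-circle distance in km between points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2.0) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0) ** 2
    return earth_radius * 2 * np.arcsin(np.sqrt(a))


# Same example data as in the question
df = pd.DataFrame({
    "Identifier": ["AL011851", "AL011851", "AL011851", "AL021851", "AL021851"],
    "num_pts": [3, 3, 3, 2, 2],
    "latitude": [28.0, 28.0, 28.1, 22.2, 12.0],
    "longitude": [-94.8, -95.4, -96.0, -97.6, -60.0],
})

# Previous point within each identifier; the first row of every group becomes NaN
prev = df.groupby("Identifier")[["latitude", "longitude"]].shift(1)

# Pass whole columns at once; NaN inputs propagate to a NaN distance
df["distance"] = haversine(prev["longitude"], prev["latitude"],
                           df["longitude"], df["latitude"])
print(df)
```

This should give the same distance column as the apply version without creating the helper p_latitude and p_longitude columns, and it tends to matter more as the number of rows grows.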

Upvotes: 1
