Nabih Bawazir

Reputation: 7255

How to calculate haversine distance between 4 columns correctly

I am trying to calculate the haversine distance between two points whose coordinates are stored in 4 columns:

    cgi                 longitude_bts   latitude_bts    longitude_poi   latitude_poi
0   510-11-32111-7131   95.335142       5.565253        95.337588       5.563713
1   510-11-32111-7135   95.335142       5.565253        95.337588       5.563713

Here's my code

def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points 
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    import numpy as np
    import math
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    # haversine formula 
    dlon = lon2 - lon1 
    dlat = lat2 - lat1 
    a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
    c = 2 * math.asin(math.sqrt(a)) 
    # Radius of earth in kilometers is 6371
    km = 6371008.799485213* c
    return km

ref_location_airport_hospital['radius'] = ref_location_airport_hospital.apply(lambda x: haversine(x['latitude_bts'], x['longitude_bts'], x['latitude_poi'], x['longitude_poi']), axis=1)

Here's the result

    cgi                 longitude_bts      latitude_bts longitude_poi   latitude_poi    radius
0   510-11-32111-7131   95.335142              5.565253     95.337588       5.563713    272.441676
1   510-11-32111-7135   95.335142              5.565253     95.337588       5.563713    272.441676

The result doesn't make sense: the two points differ by less than 0.004 degrees, so the distance should be less than 1 km.

Note: 1 degree of longitude/latitude is around 111 km
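
As a rough cross-check of that expectation, here is a minimal sketch that applies the 111 km-per-degree rule of thumb to the first row of the table. It assumes a flat-earth approximation, which is only reasonable for points this close together, and the variable names are illustrative rather than taken from the original code.

```python
import math

# Coordinates from the first row of the table (decimal degrees).
lon_bts, lat_bts = 95.335142, 5.565253
lon_poi, lat_poi = 95.337588, 5.563713

KM_PER_DEGREE = 111.0  # rule-of-thumb value, not an exact constant

# East-west degrees shrink by cos(latitude); north-south degrees do not.
mean_lat = math.radians((lat_bts + lat_poi) / 2)
dx_km = (lon_poi - lon_bts) * KM_PER_DEGREE * math.cos(mean_lat)
dy_km = (lat_poi - lat_bts) * KM_PER_DEGREE

estimate_km = math.hypot(dx_km, dy_km)
print(round(estimate_km, 2))  # 0.32
```

So a correct haversine result for this row should come out at roughly a third of a kilometer.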

Upvotes: 1

Views: 262

Answers (1)

amance

Reputation: 1770

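You're getting meters rather than kilometers, because the radius you multiply by (6371008.8) is in meters. Your call also passes latitude before longitude, while the function expects (lon1, lat1, lon2, lat2). Try this: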

import pandas as pd
import numpy as np

def haversine(lon1, lat1, lon2, lat2):
    lon1, lat1, lon2, lat2 = np.radians([lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    haver_formula = np.sin(dlat/2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2)**2

    r = 6371  # Earth radius in km; use 3958.756 for a result in miles
    dist = 2 * r * np.arcsin(np.sqrt(haver_formula))
    return pd.Series(dist)

#provided data
df = pd.DataFrame({
    'cgi': ['510-11-32111-7131', '510-11-32111-7135'],
    'longitude_bts': [95.335142, 95.335142],
    'latitude_bts': [5.565253, 5.565253],
    'longitude_poi': [95.337588, 95.337588],
    'latitude_poi': [5.563713, 5.563713],
})

df['km'] = haversine(df['longitude_bts'], df['latitude_bts'], df['longitude_poi'], df['latitude_poi'])

#output
    cgi                 longitude_bts      latitude_bts longitude_poi   latitude_poi          km
0   510-11-32111-7131   95.335142              5.565253     95.337588       5.563713    0.320316
1   510-11-32111-7135   95.335142              5.565253     95.337588       5.563713    0.320316
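
To see where the 272 in the question comes from, the sketch below evaluates the same formula for a single row using only the standard library. The helper name `haversine_scalar` and its `radius` parameter are hypothetical, introduced here just for the comparison. It calls the function once the way the question does (latitude first, radius in meters) and once in the order the function expects (longitude first, radius in km).

```python
import math

def haversine_scalar(lon1, lat1, lon2, lat2, radius):
    """Great-circle distance for one pair of points, in the units of `radius`."""
    lon1, lat1, lon2, lat2 = map(math.radians, (lon1, lat1, lon2, lat2))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))

# First row of the provided data (decimal degrees).
lon_bts, lat_bts = 95.335142, 5.565253
lon_poi, lat_poi = 95.337588, 5.563713

# As called in the question: latitude passed first, radius in meters.
swapped_m = haversine_scalar(lat_bts, lon_bts, lat_poi, lon_poi, 6371008.8)

# As called in this answer: longitude first, radius in kilometers.
correct_km = haversine_scalar(lon_bts, lat_bts, lon_poi, lat_poi, 6371)

print(round(swapped_m), round(correct_km, 2))  # 272 0.32
```

The first value is a distance in meters computed with the coordinates swapped, which is why it matches neither the expected kilometers nor the true distance of about 320 m.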

Upvotes: 1
