deepankar srigyan

Reputation: 21

Distance between two places based on longitude and latitude

I have a pandas DataFrame like the one in the picture:

[image: dataframe format]

Each row holds two post codes together with their latitude and longitude. I am trying to calculate the distance from postcode_x to postcode_y.

I wrote a Python function:

import math

def distance(lat_1,lon_1,lat_2,lon_2):
    # radius of the Earth in km
    R = 6373.0

    # coordinates
    lat1 = math.radians(lat_1)
    lon1 = math.radians(lon_1)
    lat2 = math.radians(lat_1)
    lon2 = math.radians(lon_2)

    # change in coordinates
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    # Haversine formula
    a = math.sin(dlat / 2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2)**2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    distance = R * c
    return distance

This works fine when I call it with scalar values:

lat_1 =52.2296756
lon_1 =21.0122287
lat_2 = 52.406374
lon_2 = 16.9251681
distance(lat_1,lon_1,lat_2,lon_2)
The result is 278.40645089544114.

However, when I try to use it to fill a new column of the DataFrame:

result['distance']=distance(result['LATITUDE_x'],result['LONGITUDE_x'],result['LATITUDE_y'],result['LONGITUDE_y'])

it shows the error:

TypeError: cannot convert the series to <class 'float'> 

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-56-44558335aa06> in <module>
----> 1 result['distance']=distance(result['LATITUDE_x'].astype(np.float),result['LONGITUDE_x'].astype(np.float),result['LATITUDE_y'].astype(np.float),result['LONGITUDE_y'].astype(np.float))
      2 result

<ipython-input-53-4dddb160b896> in distance(lat_1, lon_1, lat_2, lon_2)
      4 
      5 
----> 6     lat1 = math.radians(lat_1)
      7     # coordinates
      8 

c:\python\python 3.95\lib\site-packages\pandas\core\series.py in wrapper(self)
    139         if len(self) == 1:
    140             return converter(self.iloc[0])
--> 141         raise TypeError(f"cannot convert the series to {converter}")
    142 
    143     wrapper.__name__ = f"__{converter.__name__}__"

TypeError: cannot convert the series to <class 'float'>

I tried:

  1. result['distance']=distance(result['LATITUDE_x'].astype(np.float32),result['LONGITUDE_x'].astype(np.float32),result['LATITUDE_y'].astype(np.float32),result['LONGITUDE_y'].astype(np.float32))

  2. Using astype(float) instead of astype(np.float32). Both attempts show the same error.

Upvotes: 0

Views: 255

Answers (1)

Ivan De Paz Centeno

Reputation: 3785

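The problem is that your distance() function is built on the math module, whose functions accept only scalars. When you pass a pandas Series, math.radians() tries to convert the whole Series to a single float and raises the TypeError. In other words, the function is not vectorized.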
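To see the difference in isolation, here is a minimal sketch. The two-element Series `lats` is a made-up example, not data from the question, and `scalar_call_failed` is just an illustrative flag.

```python
import math

import numpy as np
import pandas as pd

# Hypothetical latitudes, for illustration only
lats = pd.Series([52.2296756, 52.406374])

try:
    # math functions expect one scalar, so this raises TypeError
    math.radians(lats)
    scalar_call_failed = False
except TypeError as exc:
    scalar_call_failed = True
    print(exc)

# NumPy converts every element and returns a Series of the same length
rads = np.radians(lats)
print(rads)
```

Casting the columns with astype() does not help, because the argument is still a Series rather than a single number.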

To solve it, you have two options: apply the function row by row (for example with df.apply()), or vectorize the function with numpy, which is the better approach:

import numpy as np

def distance(lat_1, lon_1, lat_2, lon_2):
    # approximate radius of the Earth in km
    R = 6373.0

    # coordinates in radians (np.radians works element-wise on a Series)
    lat1 = np.radians(lat_1)
    lon1 = np.radians(lon_1)
    lat2 = np.radians(lat_2)  # the question's code passed lat_1 here, which forces dlat to 0
    lon2 = np.radians(lon_2)

    # change in coordinates
    dlon = lon2 - lon1
    dlat = lat2 - lat1

    # Haversine formula
    a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return R * c
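
As a usage sketch, assuming the column names from the question: the first row below reuses the coordinates from the question's scalar test, and the second row is a made-up pair of identical points, so its distance should be 0. The function is repeated with lat_2 used for the second latitude so that the block runs on its own. The row-wise df.apply() call is the other option mentioned above; `row_wise` is only an illustrative name.

```python
import numpy as np
import pandas as pd

def distance(lat_1, lon_1, lat_2, lon_2):
    R = 6373.0  # approximate radius of the Earth in km
    lat1, lon1 = np.radians(lat_1), np.radians(lon_1)
    lat2, lon2 = np.radians(lat_2), np.radians(lon_2)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2)**2
    return R * 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))

# Sample frame: row 0 uses the question's test coordinates,
# row 1 is a hypothetical pair of identical points
result = pd.DataFrame({
    "LATITUDE_x": [52.2296756, 52.2296756],
    "LONGITUDE_x": [21.0122287, 21.0122287],
    "LATITUDE_y": [52.406374, 52.2296756],
    "LONGITUDE_y": [16.9251681, 21.0122287],
})

# Vectorized: whole columns go in, a whole column comes out
result["distance"] = distance(
    result["LATITUDE_x"], result["LONGITUDE_x"],
    result["LATITUDE_y"], result["LONGITUDE_y"],
)

# Row-wise alternative: one function call per row
row_wise = result.apply(
    lambda row: distance(row["LATITUDE_x"], row["LONGITUDE_x"],
                         row["LATITUDE_y"], row["LONGITUDE_y"]),
    axis=1,
)

print(result)
```

Both versions produce the same values. The row-wise one calls the function once per row, so it is usually noticeably slower on large DataFrames.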

Upvotes: 1
