Wu Kris

Reputation: 85

Error when calculating the distance between points with latitude and longitude in Python

I am trying to calculate the distance (in km) between successive geolocations given as latitude and longitude. I tried to use the code from this thread: Pandas Latitude-Longitude to distance between successive rows. However, I run into the error shown below.

Does anyone know how to fix this issue?

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
~\Anaconda3\lib\site-packages\pandas\core\generic.py in __getattr__(self, name)
   5464                 return self[name]
-> 5465             return object.__getattribute__(self, name)
   5466 

AttributeError: 'Series' object has no attribute 'radians'

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-56-3c590360590e> in <module>
     11 
     12 df['dist'] = haversine(df.latitude.shift(), df.longitude.shift(), 
---> 13                        df.loc[1:, 'latitude'], df.loc[1:, 'longitude'])
     14 
     15 

<ipython-input-56-3c590360590e> in haversine(lat1, lon1, lat2, lon2, to_radians, earth_radius)
      2 def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
      3     if to_radians:
----> 4         lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])
      5 
      6     a = np.sin((lat2-lat1)/2.0)**2 + \

TypeError: loop of ufunc does not support argument 0 of type Series which has no callable radians method

Here is the data frame:

>>> df_latlon

    latitude    longitude
0   37.405548   -122.078481
1   34.080610   -84.200785
2   37.770830   -122.395463
3   37.773792   -122.409865
4   41.441269   -96.494304
5   41.441269   -96.494304
6   41.441269   -96.494304
7   41.883784   -87.637668
8   26.140780   -80.124434
9   39.960000   -85.983660

Here is the code:

def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))


df_latlon['dist'] = haversine(df_latlon.latitude.shift(), df_latlon.longitude.shift(), 
                       df_latlon.loc[1:, 'latitude'], df_latlon.loc[1:, 'longitude'])


Upvotes: 0

Views: 407

Answers (2)

Jonathan Leon

Reputation: 5648

I think the issue is that you want to calculate the distance row by row, but passing whole Series into the function like this doesn't seem to work.

Try:

import io

import numpy as np
import pandas as pd

data='''
    latitude    longitude
0   37.405548   -122.078481
1   34.080610   -84.200785
2   37.770830   -122.395463
3   37.773792   -122.409865
4   41.441269   -96.494304
5   41.441269   -96.494304
6   41.441269   -96.494304
7   41.883784   -87.637668
8   26.140780   -80.124434
9   39.960000   -85.983660'''
df = pd.read_csv(io.StringIO(data), sep=r'  \s+', engine='python')
df[['lat2', 'lon2']] = df[['latitude', 'longitude']].shift()


def haversine(lat1, lon1, lat2, lon2, to_radians=True, earth_radius=6371):
    if to_radians:
        lat1, lon1, lat2, lon2 = np.radians([lat1, lon1, lat2, lon2])

    a = np.sin((lat2-lat1)/2.0)**2 + \
        np.cos(lat1) * np.cos(lat2) * np.sin((lon2-lon1)/2.0)**2

    return earth_radius * 2 * np.arcsin(np.sqrt(a))

df['dist'] = df.apply(lambda x: haversine(x['lat2'], x['lon2'], x['latitude'], x['longitude']), axis=1)

    latitude   longitude       lat2        lon2         dist
0  37.405548 -122.078481        NaN         NaN          NaN
1  34.080610  -84.200785  37.405548 -122.078481  3415.495909
2  37.770830 -122.395463  34.080610  -84.200785  3439.656694
3  37.773792 -122.409865  37.770830 -122.395463     1.307998
4  41.441269  -96.494304  37.773792 -122.409865  2248.480322
5  41.441269  -96.494304  41.441269  -96.494304     0.000000
6  41.441269  -96.494304  41.441269  -96.494304     0.000000
7  41.883784  -87.637668  41.441269  -96.494304   737.041395
8  26.140780  -80.124434  41.883784  -87.637668  1880.578726
9  39.960000  -85.983660  26.140780  -80.124434  1629.746292
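
As an alternative to the row-wise apply, here is a minimal vectorized sketch. It assumes the failure comes from packing four Series of unequal length into one list before calling np.radians (shift() keeps every row, while .loc[1:] drops the first one), and it sidesteps that by converting each argument separately and passing the full columns. The name haversine_vec is hypothetical and does not come from the linked thread; only the first four rows of the sample data are used.

```python
import numpy as np
import pandas as pd


def haversine_vec(lat1, lon1, lat2, lon2, earth_radius=6371):
    """Great-circle distance in km; accepts scalars or index-aligned Series."""
    # Convert each argument on its own instead of wrapping them in one list.
    lat1, lon1, lat2, lon2 = (np.radians(v) for v in (lat1, lon1, lat2, lon2))

    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return earth_radius * 2 * np.arcsin(np.sqrt(a))


df = pd.DataFrame({
    "latitude": [37.405548, 34.080610, 37.770830, 37.773792],
    "longitude": [-122.078481, -84.200785, -122.395463, -122.409865],
})

# shift() moves the previous row's coordinates onto the current row;
# the first row has no predecessor, so its distance is NaN.
df["dist"] = haversine_vec(df["latitude"].shift(), df["longitude"].shift(),
                           df["latitude"], df["longitude"])
print(df)
```

Because both sides keep the full index, the result lines up with the dataframe without any slicing, and the first row simply stays NaN.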

Upvotes: 0

pcoates

Reputation: 2307

You're passing a Series to the haversine function rather than a simple number for the lat and lon arguments.

I think you can use the apply function to run haversine on each row of the dataframe. However, I'm not sure of the best way for apply to get hold of the next or previous row.

So, I'd just add a couple of extra columns 'from lat' and 'from lon'. Then you will have all the data you need on each row.

# add the from lat and lon as extra columns
df_latlon['from lat'] = df_latlon['latitude'].shift(1)
df_latlon['from lon'] = df_latlon['longitude'].shift(1)

def calculate_distance(df_row):
    return haversine(df_row['from lat'], df_row['from lon'], df_row['latitude'], df_row['longitude'])

# pass each row through the haversine function via the calculate_distance
df_latlon['dist'] = df_latlon.apply(calculate_distance, axis=1)
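
As a side check on the original call, the short sketch below compares the two kinds of argument it mixes. It is only an illustration on three of the sample rows, and the variable names shifted, sliced and rad are made up for it. It suggests that the trouble may lie less in np.radians receiving a Series than in the four Series having different lengths once they are wrapped in a single list.

```python
import numpy as np
import pandas as pd

df_latlon = pd.DataFrame({
    "latitude": [37.405548, 34.080610, 37.770830],
    "longitude": [-122.078481, -84.200785, -122.395463],
})

shifted = df_latlon["latitude"].shift()   # keeps every row, first value is NaN
sliced = df_latlon.loc[1:, "latitude"]    # label-based slice, drops row 0

# The two arguments differ in length by one row.
print(len(shifted), len(sliced))

# Applied to a single Series, np.radians works and returns a Series.
rad = np.radians(df_latlon["latitude"])
print(type(rad).__name__)
```

Adding the shifted columns to the dataframe, as done above, keeps everything the same length, which is one reason the row-wise approach works.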

Upvotes: 1
