Reputation: 1193
These are my dataframes:
import pandas as pd
from geopy.distance import geodesic
df1 = pd.DataFrame({'name':['a','b','c','d'], 'lat':[37.51, 41.33,37.51, 41.33], 'long':[71.81, 77.89,71.81, 77.89]})
df2 = pd.DataFrame({'id':[1,2], 'loc_a':['a','c'],'loc_z':['b','d']})
I want to calculate distance between loc_a
and loc_z
by using coordinates from df1
for respective a, b, c, d
values.
In the past I have used this formula but my LONGITUDE and LATITUDE columns would be in same dataframe.
df2['lat_long_diff'] = df2.apply(lambda x : geodesic((x['LATITUDE_first'],x['LONGITUDE_first']),(x['LATITUDE_second'],x['LONGITUDE_second'])).miles,axis=1)
In this case I want LATITUDE and LONGITUDE should be taken from df1
Output :
df2
id loc_a loc_z lat_long_diff
1 a b "value in miles"
2 c d "value in miles"
Upvotes: 1
Views: 148
Reputation: 71687
You can map the labels (a, b, c, d)
in df2
with their respective coordinates (lat, long)
from df1
and then use .apply
to calculate the distance.:
mapping = df1.set_index('name').agg(tuple, axis=1)
coords = df2[['loc_a', 'loc_z']].transform(lambda s: s.map(mapping))
df2['lat_long_diff'] = coords.apply(lambda x : geodesic(x['loc_a'], x['loc_z']).miles, axis=1)
Intermediate steps:
# print(mapping)
name
a (37.51, 71.81)
b (41.33, 77.89)
c (37.51, 71.81)
d (41.33, 77.89)
dtype: object
# print(coords)
loc_a loc_z
0 (37.51, 71.81) (41.33, 77.89)
1 (37.51, 71.81) (41.33, 77.89)
Result:
# print(df2)
id loc_a loc_z lat_long_diff
0 1 a b 418.480212
1 2 c d 418.480212
Upvotes: 0
Reputation: 863226
Use DataFrame.merge
first and then apply
:
df2 = df2.merge(df1.add_suffix('1').rename(columns={'name1':'loc_a'}), on='loc_a', how='left')
df2 = df2.merge(df1.add_suffix('2').rename(columns={'name2':'loc_z'}), on='loc_z', how='left')
print (df2)
id loc_a loc_z lat1 long1 lat2 long2
0 1 a b 37.51 71.81 41.33 77.89
1 2 c d 37.51 71.81 41.33 77.89
df2['lat_long_diff'] = df2.apply(lambda x : geodesic((x['lat1'],x['long1']),(x['lat2'],x['long2'])).miles,axis=1)
Upvotes: 1
Reputation: 3294
Join the two data frames and then you can apply your formula.
Upvotes: 0