Murtaza Haji
Murtaza Haji

Reputation: 1193

access values from two dataframes on common key

These are my dataframes:

import pandas as pd
from geopy.distance import geodesic

df1 = pd.DataFrame({'name':['a','b','c','d'], 'lat':[37.51, 41.33,37.51, 41.33], 'long':[71.81, 77.89,71.81, 77.89]})

df2 = pd.DataFrame({'id':[1,2], 'loc_a':['a','c'],'loc_z':['b','d']})

I want to calculate distance between loc_a and loc_z by using coordinates from df1 for respective a, b, c, d values.

In the past I have used this formula but my LONGITUDE and LATITUDE columns would be in same dataframe.

df2['lat_long_diff'] = df2.apply(lambda x : geodesic((x['LATITUDE_first'],x['LONGITUDE_first']),(x['LATITUDE_second'],x['LONGITUDE_second'])).miles,axis=1)

In this case I want LATITUDE and LONGITUDE should be taken from df1

Output :

df2

id    loc_a     loc_z       lat_long_diff
1      a          b        "value in miles"
2      c          d        "value in miles"

Upvotes: 1

Views: 148

Answers (3)

Shubham Sharma
Shubham Sharma

Reputation: 71687

You can map the labels (a, b, c, d) in df2 with their respective coordinates (lat, long) from df1 and then use .apply to calculate the distance.:

mapping = df1.set_index('name').agg(tuple, axis=1)
coords = df2[['loc_a', 'loc_z']].transform(lambda s: s.map(mapping))
df2['lat_long_diff'] = coords.apply(lambda x : geodesic(x['loc_a'], x['loc_z']).miles, axis=1)

Intermediate steps:

# print(mapping)
name
a    (37.51, 71.81)
b    (41.33, 77.89)
c    (37.51, 71.81)
d    (41.33, 77.89)
dtype: object

# print(coords)
            loc_a           loc_z
0  (37.51, 71.81)  (41.33, 77.89)
1  (37.51, 71.81)  (41.33, 77.89)

Result:

# print(df2)
  id loc_a loc_z  lat_long_diff
0   1     a     b     418.480212
1   2     c     d     418.480212

Upvotes: 0

jezrael
jezrael

Reputation: 863226

Use DataFrame.merge first and then apply:

df2 = df2.merge(df1.add_suffix('1').rename(columns={'name1':'loc_a'}), on='loc_a', how='left') 
df2 = df2.merge(df1.add_suffix('2').rename(columns={'name2':'loc_z'}), on='loc_z', how='left') 
print (df2)
   id loc_a loc_z   lat1  long1   lat2  long2
0   1     a     b  37.51  71.81  41.33  77.89
1   2     c     d  37.51  71.81  41.33  77.89

df2['lat_long_diff'] = df2.apply(lambda x : geodesic((x['lat1'],x['long1']),(x['lat2'],x['long2'])).miles,axis=1)

Upvotes: 1

paradocslover
paradocslover

Reputation: 3294

Join the two data frames and then you can apply your formula.

Upvotes: 0

Related Questions