Reputation: 104
I have a dataframe(df3)
df3 = pd.DataFrame({
'Origin':['DEL','BOM','AMD'],
'Destination':['BOM','AMD','DEL']})
comprising of Travel Data which contains Origin/Destination and I'm trying to map Latitude and Longitude for Origin & Destination airports using 3 letter city codes (df_s3).
df_s3 = pd.DataFrame({
'iata_code':['AMD','BOM','DEL'],
'Lat':['72.6346969603999','72.8678970337','77.103104'],
'Lon':['23.0771999359','19.0886993408','28.5665']})
I've tried mapping them one at a time, i.e.
df4=pd.merge(left=df3,right=df_s3,how='left',left_on=['Origin'],right_on=['iata_code'],suffixes=['_origin','_origin'])
df5=pd.merge(left=df4,right=df_s3,how='left',left_on=['Destination'],right_on=['iata_code'],suffixes=['_destination','_destination'])
This updates the values in the dataframe but the columns corresponding to origin lat/long have '_destination' as the suffix
I've even taken an aspirational long shot by combining the two, i.e.
df4=pd.merge(left=df3,right=df_s3,how='left',left_on=['Origin','Destination'],right_on=['iata_code','iata_code'],suffixes=['_origin','_destination'])
Both of these dont seem to be working. Any suggestions on how to make it work in a larger dataset while keeping the processing time low.
Upvotes: 1
Views: 181
Reputation: 53
You can try to apply to each column a function like this one:
def from_place_to_coord(place: str):
if place in df_s3['iata_code'].to_list():
Lat = df_s3[df_s3['iata_code'] == place]['Lat'].values[0]
Lon = df_s3[df_s3['iata_code'] == place]['Lon'].values[0]
return Lat, Lon
else:
print('Not found')
and then:
df3['origin_loc'] = df3['Origin'].apply(from_place_to_coord)
df3['destination_loc'] = df3['Destination'].apply(from_place_to_coord)
It will return you 2 more columns with a tuple of Lat,Lon according to the location
Upvotes: 0
Reputation: 6526
Your solution was almost correct. But you need to specify the origin suffix in the second merge:
df4=pd.merge(left=df3,
right=df_s3,how='left',
left_on=['Origin'],
right_on=['iata_code'])
df5=pd.merge(left=df4,
right=df_s3,how='left',
left_on=['Destination'],
right_on=['iata_code'],
suffixes=['_origin', '_destination'])
In the first merge you don't need to specify any suffix as there is no overlap. In the second merge you need to specify the suffix for the right side and the left side. The right side is the longitude and latitude from the origin and the left side are from the destination.
Upvotes: 1