Reputation: 31
I have a column with 'CITY' and 'STATE' strings. I tried using geocoder from geopy library to calculate Latitude and Longitude but it timeouts as it exceeds the number of requests also there are about 85895 rows in the data set. So I did value count for the 'CITY_STATE' column and there are 1340 values. Is there a way I can find Latitude and Longitude for those 1340 values and then map those values with the corresponding 'CITY' and 'STATE'.
from opencage.geocoder import OpenCageGeocode
key = 'my_key' # get api key from : https://opencagedata.com
geocoder = OpenCageGeocode(key)
list_lat = [] # create empty lists
list_long = []
for index, row in train.iterrows(): # iterate over rows in dataframe
City = row['CITY']
State = row['STATE']
query = str(City)+','+str(State)
results = geocoder.geocode(query)
lat = results[0]['geometry']['lat']
long = results[0]['geometry']['lng']
list_lat.append(lat)
list_long.append(long)
# create new columns from lists
train['lat'] = list_lat
test['lon'] = list_long
Upvotes: 0
Views: 591
Reputation: 863116
I think you can use GroupBy.apply
with custom function and joined both columns to Series s
:
s = train['CITY'].astype(str) + ', ' + train['STATE'].astype(str)
def func(x):
results = geocoder.geocode(x.name)
x['lat'] = results[0]['geometry']['lat']
x['lon'] = results[0]['geometry']['lng']
return x
train = train.groupby(s).apply(func)
Upvotes: 1