Reputation: 11
I've looked around for a solution and tried filtering my df to rows where the longitude and latitude are not null, but to no avail. This is my first time using the geopy package, so maybe my error stems from that. I have a df that includes long/lat coords, and I'm trying to attach city, state and country to each observation. When I limit my df to just the first 10 rows, my code works like a charm. When I apply it to the whole df (34,556 observations), I get this error: 'NoneType' object has no attribute 'raw'.
import pandas as pd
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent="geoapiExercises")

df_power_org = pd.read_csv('global_power_plant_database.csv', low_memory=False)
# Keep only rows that have both coordinates
df_power_org = df_power_org[df_power_org.longitude.notnull()]
df_power_org = df_power_org[df_power_org.latitude.notnull()]

def city_state_country(row):
    coord = f"{row['latitude']}, {row['longitude']}"
    location = geolocator.reverse(coord, exactly_one=True, language='en')
    address = location.raw['address']  # the error is raised here
    city = address.get('city', '')
    state = address.get('state', '')
    country = address.get('country', '')
    row['city'] = city
    row['state'] = state
    row['country2'] = country
    return row

df_power_org = df_power_org.apply(city_state_country, axis=1)
Any advice is deeply appreciated.
Upvotes: 1
Views: 2185
Reputation: 13242
From geopy's documentation:
Nominatim is free, but provides low request limits.
Digging a little deeper, Nominatim's site states:
No heavy uses (an absolute maximum of 1 request per second).
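It's likely that you're being blocked by Nominatim for excessive requests. When a lookup returns no result, geolocator.reverse gives back None instead of a location, and location.raw then raises the error you're seeing.
If you want to use Nominatim and follow their instructions, you can modify your code to pause after each request. At one request per second, all 34k requests will take about 10 hours: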
import pandas as pd
from geopy.geocoders import Nominatim
from time import sleep

geolocator = Nominatim(user_agent="geoapiExercises")

df_power_org = pd.read_csv('global_power_plant_database.csv', low_memory=False)
df_power_org = df_power_org[df_power_org.longitude.notnull()]
df_power_org = df_power_org[df_power_org.latitude.notnull()]

def city_state_country(row):
    coord = f"{row['latitude']}, {row['longitude']}"
    sleep(1)  # stay within Nominatim's limit of 1 request per second
    location = geolocator.reverse(coord, exactly_one=True, language='en')
    if not location:
        # if you see many in a row, it's probably Nominatim blocking you.
        # if it's just every once in a while, there were just some bad results.
        print('Failed with coord: ', coord)
        row['city'], row['state'], row['country2'] = None, None, None
        return row
    address = location.raw['address']
    city = address.get('city', '')
    state = address.get('state', '')
    country = address.get('country', '')
    row['city'] = city
    row['state'] = state
    row['country2'] = country
    return row

df_power_org = df_power_org.apply(city_state_country, axis=1)
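If the data contains repeated coordinates, looking up each distinct pair only once could cut the number of requests. Here is a rough sketch of that idea combined with the same pause-and-check-for-None approach. The helper names (make_throttled, extract_place, reverse_unique) are made up for illustration and are not part of geopy; the reverse argument is assumed to be any callable that takes a "lat, lon" string and returns either None or an object with a .raw dict, which is how geolocator.reverse behaves in the code above.

```python
import time

def make_throttled(func, min_delay_seconds=1.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Wrap func so consecutive calls start at least min_delay_seconds apart.

    Hypothetical helper; clock and sleep are injectable so it can be
    exercised without actually waiting.
    """
    last_call = [None]  # time of the previous call, None before the first

    def wrapper(*args, **kwargs):
        if last_call[0] is not None:
            wait = min_delay_seconds - (clock() - last_call[0])
            if wait > 0:
                sleep(wait)
        last_call[0] = clock()
        return func(*args, **kwargs)

    return wrapper

def extract_place(location):
    """Return (city, state, country), or three Nones if the lookup failed."""
    if location is None:
        return (None, None, None)
    address = location.raw.get('address', {})
    return (address.get('city', ''),
            address.get('state', ''),
            address.get('country', ''))

def reverse_unique(coords, reverse):
    """Reverse-geocode each distinct (lat, lon) pair exactly once.

    coords is an iterable of (lat, lon) tuples. Returns a dict mapping
    each pair to its (city, state, country) tuple.
    """
    cache = {}
    for lat, lon in coords:
        key = (lat, lon)
        if key not in cache:
            cache[key] = extract_place(reverse(f"{lat}, {lon}"))
    return cache
```

With the code above, this might be used as places = reverse_unique(zip(df_power_org.latitude, df_power_org.longitude), make_throttled(lambda c: geolocator.reverse(c, exactly_one=True, language='en'))), and the resulting dict mapped back onto the rows. I believe geopy also ships its own RateLimiter helper in geopy.extra.rate_limiter, which may be a better choice than a hand-rolled wrapper; check the docs for its exact parameters.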
Upvotes: 2